Dissertations / Theses: 'Fuzzy clusters'

1

Vargas, Rogerio Rodrigues de. "Uma nova forma de calcular os centros dos Clusters em algoritmos de agrupamento tipo fuzzy c-means." Universidade Federal do Rio Grande do Norte, 2012. http://repositorio.ufrn.br:8080/jspui/handle/123456789/17949.

Abstract:

Made available in DSpace on 2014-12-17T15:47:00Z (GMT). No. of bitstreams: 1 RogerioRV_TESE.pdf: 769325 bytes, checksum: ddaac964e1c74fba3533b5cdd90927b2 (MD5) Previous issue date: 2012-03-30
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Clustering data is a very important task in data mining, image processing and pattern recognition problems. One of the most popular clustering algorithms is Fuzzy C-Means (FCM). This thesis proposes a new way of calculating the cluster centers in the FCM algorithm, which we call ckMeans, and which can also be applied to some variants of FCM; in particular, we apply it here to variants that use other distances. The goal of this change is to reduce the number of iterations and the processing time of these algorithms without affecting the quality of the partition, or even to improve the number of correct classifications in some cases. We also developed an algorithm based on ckMeans to handle interval data using interval membership degrees. This algorithm allows the data to be represented without converting interval data into point data, as happens in other extensions of FCM that deal with interval data. To validate the proposed methodologies, we compared clusterings produced by ckMeans, K-Means and FCM (since the procedure proposed in this thesis for calculating the centers is similar to that of K-Means), considering three different distances. Several well-known databases were used. The results of interval ckMeans were compared with those of other interval clustering algorithms on an interval database containing the minimum and maximum monthly temperatures for a given year for 37 cities distributed across the continents.

2

Frigui, Hichem. "New approaches for robust clustering and for estimating the optimal number of clusters." 1997. http://wwwlib.umi.com/cr/mo/fullcit?p9842528.

3

Felizardo, Rui Miguel Meireles. "A study on parallel versus sequential relational fuzzy clustering methods." Master's thesis, Faculdade de Ciências e Tecnologia, 2011. http://hdl.handle.net/10362/5663.

Abstract:

Dissertation submitted for the degree of Master in Informatics Engineering
Relational fuzzy clustering is a recent and growing area of study. New algorithms have been developed, such as FastMap Fuzzy c-Means (FMFCM) and the Fuzzy Additive Spectral Clustering Method (FADDIS), for which interesting experimental results were reported in the works that introduced them. Since these algorithms are new to the fuzzy relational clustering community, few experimental studies are available. This thesis responds to the need for further investigation of these algorithms through a comparative experimental study of two families of algorithms: the parallel and the sequential versions. The two families differ in the way they cluster data. Parallel versions extract clusters simultaneously from the data and need the number of clusters as an input parameter, while sequential versions extract clusters one by one until a stop condition is met, so that the number of clusters is a natural output of the algorithm. The algorithms are studied for their effectiveness in retrieving good cluster structures, by analysing the quality of the partitions, and in determining the number of clusters, by applying several validation measures. An extensive simulation study has been conducted over two data generators specifically constructed for the algorithms under study, in particular to study their robustness to noisy data. Results on real benchmark data are also discussed. Particular attention is paid to the most adequate pre-processing of relational data, in particular the pseudo-inverse Laplacian transformation.

4

Garcia, Ian. "Eliminating Redundant and Less-informative RSS News Articles Based on Word Similarity and A Fuzzy Equivalence Relation." Diss., 2007. http://contentdm.lib.byu.edu/ETD/image/etd1688.pdf.

5

Franco, Pedro Guerra de Almeida. "Fuzzy clustering não supervisionado na detecção automática de regiões de upwelling a partir de mapas de temperatura da superfície oceânica." Master's thesis, FCT - UNL, 2009. http://hdl.handle.net/10362/2383.

Abstract:

Work presented within the Master's programme in Informatics Engineering, as a partial requirement for the degree of Master in Informatics Engineering
Coastal upwelling off the coast of mainland Portugal is a well-studied phenomenon in the oceanographic literature. However, there is little work in the scientific literature on its automatic detection, in particular using clustering techniques. Fuzzy clustering algorithms have been widely explored in remote sensing and image segmentation, and recent research has shown that these techniques achieve promising results in detecting upwelling from sea surface temperature maps obtained from satellite images. The work developed in this dissertation aims to define a method that can automatically identify the region that defines the phenomenon. As the object of study, two independent sets of temperature maps were analysed, 61 maps in total, covering the diversity of scenarios in which upwelling occurs. Focusing on the problem domain, a literature survey was carried out covering reference works and more recent studies, mainly on clustering techniques, fuzzy clustering and its application to image segmentation. Based on one of the most influential algorithms in the literature, Fuzzy c-means (FCM), a new approach was developed using the 'Anomalous Pattern' initialization method, which attempts to solve two basic problems of FCM: validating the best number of clusters and the dependence on random initialization. After a study of the stopping conditions of the new algorithm, AP-FCM, a parameterization was established that automatically determines a good number of clusters. Analysis of the results shows that the segmentations generated are of high quality, faithfully reproducing the structures present in the original maps, and that AP-FCM is computationally more efficient than FCM.
A second algorithm was also implemented, based on a Histogram Thresholding technique, which also obtains good segmentations but does not allow a parameterization for automatically defining the number of groups. From the segmentations obtained, a feature definition module was developed, from which a composite criterion was created that allows automatic identification of the cluster that delimits the upwelling region.

6

Dimitriadou, Evgenia, Andreas Weingessel, and Kurt Hornik. "Fuzzy voting in clustering." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/742/1/document.pdf.

Abstract:

In this paper we present a fuzzy voting scheme for cluster algorithms. This fuzzy voting method allows us to combine several runs of cluster algorithms resulting in a common fuzzy partition. This helps us to overcome instabilities of the cluster algorithms and results in a better clustering.
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"

7

Hammah, Reginald Edmund. "Intelligent delineation of rock discontinuity data using fuzzy cluster analysis." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0012/NQ41436.pdf.

8

Timm, Heiko. "Fuzzy-Clusteranalyse Methoden zur Exploration von Daten mit fehlenden Werten sowie klassifizierten Daten /." [S.l. : s.n.], 2002. http://deposit.ddb.de/cgi-bin/dokserv?idn=965011097.

9

Pangaonkar, Manali. "Exploratory Study of Fuzzy Clustering and Set-Distance Based Validation Indexes." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353342433.

10

Stetco, Adrian. "An investigation into fuzzy clustering quality and speed : fuzzy C-means with effective seeding." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/an-investigation-into-fuzzy-clustering-quality-and-speed-fuzzy-cmeans-with-effective-seeding(fac3eab2-919a-436c-ae9b-1109b11c1cc2).html.

Abstract:

Cluster analysis, the automatic procedure by which large data sets can be split into similar groups of objects (clusters), has innumerable applications in a wide range of problem domains. Improvements in clustering quality (as captured by internal validation indexes) and speed (number of iterations until cost function convergence), the main focus of this work, have many desirable consequences. They can result, for example, in faster and more precise detection of illness onset based on symptoms, or provide investors with rapid detection and visualization of patterns in financial time series. Partitional clustering, one of the most popular ways of doing cluster analysis, can be classified into two main categories: hard (where the clusters discovered are disjoint) and soft (also known as fuzzy; clusters are non-disjoint, or overlapping). In this work we consider how improvements in the speed and solution quality of the soft partitional clustering algorithm Fuzzy C-means (FCM) can be achieved through more careful and informed initialization based on data content. By carefully selecting the cluster centers in a way which disperses the initial cluster centers through the data space, the resulting FCM++ approach samples starting cluster centers during the initialization phase. The cluster centers are well spread in the input space, resulting in both faster convergence times and higher quality solutions. Moreover, we allow the user to specify a parameter indicating how far apart the cluster centers should be picked in the data space right at the beginning of the clustering procedure. We show FCM++'s superior behaviour in both convergence times and quality compared with existing methods, on a wide range of artificially generated and real data sets. We consider a case study where we propose a methodology based on FCM++ for pattern discovery on synthetic and real world time series data. 
We discuss a method to utilize both Pearson correlation and Multi-Dimensional Scaling in order to reduce data dimensionality, remove noise and make the dataset easier to interpret and analyse. We show that by using FCM++ we can make a positive impact on the quality (with the Xie Beni index being lower in nine out of ten cases for FCM++) and speed (with on average 6.3 iterations compared with 22.6 iterations) when trying to cluster these lower dimensional, noise reduced, representations of the time series. This methodology provides a clearer picture of the cluster analysis results and helps in detecting similarly behaving time series which could otherwise come from any domain. Further, we investigate the use of Spherical Fuzzy C-Means (SFCM) with the seeding mechanism used for FCM++ on news text data retrieved from a popular British newspaper. The methodology allows us to visualize and group hundreds of news articles based on the topics discussed within. The positive impact made by SFCM++ translates into a faster process (with on average 12.2 iterations compared with the 16.8 needed by the standard SFCM) and a higher quality solution (with the Xie Beni index being lower for SFCM++ in seven out of every ten runs).
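
As a rough illustration of the idea in this abstract, the sketch below seeds Fuzzy C-Means with centers that are spread through the data space and then runs the standard FCM updates. The abstract does not give the exact seeding rule, so the k-means++-style rule used here (sampling each new center with probability proportional to its distance from the nearest chosen center raised to a power `p`) is an assumption, as are the function names `spread_seeding` and `fcm` and all default values.

```python
import numpy as np

def spread_seeding(X, c, p=2.0, rng=None):
    """Pick c initial centers from X, favouring points far from those already chosen.

    Assumed rule (not taken from the abstract): sampling weight = distance ** p,
    so a larger p pushes the initial centers further apart.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]          # first center: uniform at random
    for _ in range(c - 1):
        chosen = np.array(centers)
        # distance from every point to its nearest already-chosen center
        d = np.linalg.norm(X[:, None, :] - chosen[None, :, :], axis=2).min(axis=1)
        w = d ** p
        total = w.sum()
        probs = w / total if total > 0 else np.full(n, 1.0 / n)
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)

def fcm(X, centers, m=2.0, tol=1e-5, max_iter=300):
    """Standard FCM iterations starting from the given centers.

    Returns the final centers, the membership matrix computed in the last
    iteration, and the number of iterations performed.
    """
    for it in range(1, max_iter + 1):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)               # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # memberships, rows sum to 1
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, U, it
```

Calling `fcm(X, spread_seeding(X, c))` and comparing the returned iteration count with a run started from uniformly random centers is one simple way to observe the kind of speed difference the abstract reports; the figures quoted in the abstract come from the thesis, not from this sketch.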

11

Simões, Rodrigo Ferreira. "Localização industrial e relações intersetoriais : uma analise de "fuzzy cluster" para Minas Gerais." [s.n.], 2003. http://repositorio.unicamp.br/jspui/handle/REPOSIP/285880.

Abstract:

Advisor: Angela Antonia Kageyama
Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Economia
Made available in DSpace on 2018-08-03T14:28:24Z (GMT). No. of bitstreams: 1 Simoes_RodrigoFerreira_D.pdf: 834850 bytes, checksum: aeaf6b5c0e31d70388a223ad3fc323d4 (MD5) Previous issue date: 2003
Doctorate

12

Zubková, Kateřina. "Text mining se zaměřením na shlukovací a fuzzy shlukovací metody." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2018. http://www.nusl.cz/ntk/nusl-382412.

Abstract:

This thesis is focused on cluster analysis in the field of text mining and its application to real data. The aim of the thesis is to find suitable categories (clusters) in the transcribed calls recorded in the contact center of Česká pojišťovna a.s. by transferring these textual documents into the vector space using basic text mining methods and the implemented clustering algorithms. From the formal point of view, the thesis contains a description of preprocessing and representation of textual data, a description of several common clustering methods, cluster validation, and the application itself.

13

Brož, Zdeněk. "Fuzzy hodnocení investic - brownfield redevelopment." Doctoral thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2013. http://www.nusl.cz/ntk/nusl-233755.

Abstract:

This dissertation focuses on investment and decision support using modern methods, in particular the analysis, evaluation and selection of so-called brownfields for redevelopment (revitalization). The aim of this work is to propose a universal method that facilitates the decision-making process. In practice, the decision-making process is also complicated by the large number of relevant parameters that influence the final decision. The proposed method is based on fuzzy logic, modelling, statistical analysis, cluster analysis, graph theory and sophisticated methods of collecting and processing information. The new method makes the process of analysing and comparing alternative investments more efficient and allows a large volume of information to be processed more accurately. As a result, the set of most suitable alternative investments is reduced on the basis of a hierarchy of parameters set by the investor.

14

Bank, Mathias. "AIM - A Social Media Monitoring System for Quality Engineering." Doctoral thesis, Universitätsbibliothek Leipzig, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-115894.

Abstract:

In the last few years the World Wide Web has dramatically changed the way people communicate with each other. The growing availability of Social Media systems such as Internet fora, weblogs and social networks ensures that the Internet is today what it was originally designed for: a technical platform in which all users are able to interact with each other. Nowadays, there are billions of user comments available discussing all aspects of life, and the data source is still growing. This thesis investigates whether it is possible to use this growing amount of freely provided user comments to extract quality-related information. The concept is based on the observation that customers do not only post marketing-relevant information. They also publish product-oriented content, including positive and negative experiences. It is assumed that this information represents a valuable data source for quality analyses: the original voices of the customers promise a more exact and more concrete definition of "quality" than the one that is available to manufacturers or market researchers today. However, the huge amount of unstructured user comments makes their evaluation very complex. It is impossible for an analyst to manually investigate the provided customer feedback. Therefore, Social Media specific algorithms have to be developed to collect, pre-process and finally analyze the data. This has been done by the Social Media monitoring system AIM (Automotive Internet Mining) that is the subject of this thesis. It investigates how manufacturers, products, product features and related opinions are discussed in order to estimate the overall product quality from the customers' point of view. AIM is able to track different types of data sources using a flexible multi-agent based crawler architecture. 
In contrast to classical web crawlers, the multi-agent based crawler supports individual crawling policies to minimize the download of irrelevant web pages. In addition, an unsupervised wrapper induction algorithm is introduced to automatically generate content extraction parameters which are specific to the crawled Social Media systems. The extracted user comments are analyzed by different content analysis algorithms to gain a deeper insight into the discussed topics and opinions. Three different topic types are supported, depending on the analysis needs: (1) highly reliable analysis results are produced by a special context-aware, taxonomy-based classification system; (2) fast ad hoc analyses are applied on top of classical full-text search capabilities; (3) blind spots are detected by a new fuzzified hierarchical clustering algorithm, which generates topical clusters while supporting multiple topics within each user comment. All three topic types are treated in a unified way so that an analyst can apply all methods simultaneously and interchangeably. The systematically processed user comments are visualized within an easy and flexible interactive analysis frontend. Special abstraction techniques support the investigation of thousands of user comments with minimal time effort. Specifically created indices show the relevance of a given topic and customer satisfaction with it.
In recent years the World Wide Web has changed dramatically. A few years ago it was still primarily an information source in which a small share of users could publish content; it has since developed into a communication platform in which every user can actively participate. The resulting volume of data covers every aspect of daily life, including quality topics. Analysing this data promises to improve quality assurance measures significantly. It makes it possible to address topics that are difficult to measure with classical sensors. However, the systematic and reproducible analysis of user-generated data requires adapting existing tools and developing new Social Media specific algorithms. This thesis creates a completely new Social Media monitoring system for this purpose, with which an analyst can analyse thousands of user posts with minimal time effort. Applying the system has revealed several advantages that make it possible to recognise the customer-driven definition of "quality".

15

Kanade, Parag M. "Fuzzy ants as a clustering concept." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000397.

16

Camara, Assa. "Využití fuzzy množin ve shlukové analýze se zaměřením na metodu Fuzzy C-means Clustering." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2020. http://www.nusl.cz/ntk/nusl-417051.

Abstract:

This master's thesis deals with cluster analysis, more specifically with clustering methods that use fuzzy sets. Basic clustering algorithms and the necessary multivariate transformations are described in the first chapter. In the practical part, in the third chapter, we apply fuzzy c-means clustering and k-means clustering to real data. The data used for clustering are the inputs of the chemical transport model CMAQ, which is used to approximate the concentration of air pollutants in the atmosphere. We apply two different clustering methods to the data. We used two different methods to select the optimal weighting exponent for finding structure in our data. We compared all 3 resulting data structures. The structures resembled each other, but with fuzzy c-means clustering one of the clusters did not resemble any of the clustering inputs. The end of the third chapter is dedicated to an attempt to find a regression model that captures the relationship between the inputs and outputs of the CMAQ model.

17

Hore, Prodip. "Scalable frameworks and algorithms for cluster ensembles and clustering data streams." [Tampa, Fla.] : University of South Florida, 2007. http://purl.fcla.edu/usf/dc/et/SFE0002135.

18

Rawashdeh, Mohammad Y. "A Relational Framework for Clustering and Cluster Validity and the Generalization of the Silhouette Measure." Thesis, University of Cincinnati, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3625824.

Abstract:

By clustering one seeks to partition a given set of points into a number of clusters such that points in the same cluster are similar and are dissimilar to points in other clusters. By virtue of this goal, data of a relational nature become typical for clustering. The similarity and dissimilarity relations between the data points are supposed to be the nuts and bolts of cluster formation. Thus, the task is driven by the notion of similarity between the data points. In practice, the similarity is usually measured by the pairwise distances between the data points. Indeed, the objective functions of the two widely used clustering algorithms, namely k-means and fuzzy c-means, are expressed in terms of the pairwise distances between the data points.

The clustering task is complicated by the choice of the distance measure and by the need to estimate the number of clusters. Fuzzy c-means is convenient when there are uncertainties in allocating points in overlapping areas to clusters. The k-means algorithm allocates the points unequivocally to clusters, overlooking the similarities between points in overlapping areas. The fuzzy approach allows a point to be a member of as many clusters as necessary; thus it provides better insight into the relations between the points in overlapping areas.

In this thesis we develop a relational framework that is inspired by the silhouette measure of clustering quality. The framework asserts the relations between the data points by means of logical reasoning with the cluster membership values. The original description of computing the silhouettes is limited to crisp partitions. A natural generalization of silhouettes to fuzzy partitions is given within our framework. Moreover, two notions of silhouettes emerge within the framework at different levels of granularity, namely the point-wise silhouette and the center-wise silhouette. With this generalization, each silhouette is capable of measuring the extent to which a crisp or fuzzy partition has fulfilled the clustering goal at the level of the individual points or of the cluster centers. The partitions are evaluated by the silhouette measure in conjunction with point-to-point or center-to-point distances.

With this generalization, the average silhouette value becomes a reasonable device for selecting between crisp and fuzzy partitions of the same data set. Accordingly, one can find out which partition better represents the relations between the data points, in accordance with their pairwise distances. This powerful feature of the generalized silhouettes has exposed a problem with the partitions generated by fuzzy c-means. We have observed that defuzzifying the fuzzy c-means partitions always improves the overall representation of the relations between the data points. This is due to the inconsistency between some of the membership values and the distances between the data points. This inconsistency has been reported by others on a couple of occasions in real-life applications.

Finally, we present an experiment that demonstrates a successful application of the generalized silhouette measure in feature selection for highly imbalanced classification. A significant improvement in the classification for a real data set has resulted from a significant reduction in the number of features.
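
To make the contrast between fuzzy and crisp allocation in this abstract concrete, the sketch below computes standard fuzzy c-means membership values for fixed centers and then defuzzifies them. The abstract does not say which defuzzification rule the thesis uses; the maximum-membership rule shown here is an assumption, and the function names `fcm_memberships` and `defuzzify` are hypothetical. The generalized silhouette itself is not reproduced, since the abstract does not define it.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership of each point in each cluster (rows sum to 1)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                   # guard against zero distances
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def defuzzify(U):
    """Assumed rule: give each point full membership in its highest-membership cluster."""
    labels = U.argmax(axis=1)
    crisp = np.zeros_like(U)
    crisp[np.arange(U.shape[0]), labels] = 1.0
    return crisp

# Four points on a line and two fixed centers at the extremes.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
centers = np.array([[0.0, 0.0], [4.0, 0.0]])

U = fcm_memberships(X, centers)   # the midpoint [2, 0] gets membership 0.5 in each cluster
C = defuzzify(U)                  # every point is assigned to exactly one cluster
```

With the fuzzifier `m = 2`, the point at distance 1 from one center and 3 from the other receives memberships 0.9 and 0.1, while the crisp partition records only the nearer cluster; ties such as the midpoint are broken here in favour of the first cluster.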

19

Wedding, Donald K. "Extending the data mining software packages SAS Enterprise Miner and SPSS Clementine to handle fuzzy cluster membership : implementation with examples." 2009. http://eprints.ccsu.edu/archive/00000553/02/1997FT.pdf.

Abstract:

Thesis (M.S.) -- Central Connecticut State University, 2009.
Thesis advisor: Roger Bilisoly. "... in partial fulfillment of the requirements for the degree of Master of Science in Data Mining." Includes bibliographical references (leaves 119-124). Also available via the World Wide Web.

20

Quinteiro, José António Teixeira. "Segmentação de individuos no Facebook que gostam de música: abordagem exploratória, recorrendo à comparação entre dois algoritmos, k-means e fuzzy c-means." Master's thesis, Instituto Superior de Economia e Gestão, 2011. http://hdl.handle.net/10400.5/4338.

Abstract:

Master's in Management/MBA
To define the best strategic plans, the marketing decisions taken to approach the market, choose the best advertising campaign, and select the segment and the type of product or service to offer must be based on a sound technical analysis of the available data or information. The choice of segmentation method is of paramount importance, since the results obtained may change the target market selection strategy and the positioning strategy for products or services, in addition to the costs inherent in making the decision. This study seeks to find differences between two descriptive post-hoc segmentation methods (k-means and Fuzzy C-Means) in the clusters they obtain, based on the Portuguese population who like music and have an active Facebook account. The work included a review of the known literature, followed by segmentation of the sample obtained using the two algorithms. The study was complemented with a descriptive analysis of the frequencies of mode, acquisition and listening for the various types of music.

21

Pereira, Ana Paula de Jesus Tomé. "Modelo de suporte à tomada de decisão sobre de acidentes de trânsito com vítimas baseado em lógica fuzzy." Universidade Federal da Paraíba, 2013. http://tede.biblioteca.ufpb.br:8080/handle/tede/6544.

Abstract:

Made available in DSpace on 2015-05-14T12:47:15Z (GMT). No. of bitstreams: 1 ArquivoTotalAnaPaula.pdf: 4539714 bytes, checksum: e81023113c80e20aab9cc31359a349d7 (MD5) Previous issue date: 2013-08-27
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
Traffic accidents represent, in Brazil, a serious economic and especially social problem, relevant for the magnitude of mortality and the number of people suffering from the resulting sequelae, thus becoming a serious public health problem. This research aimed to develop a decision support model based on fuzzy logic, supported by spatial and spatio-temporal analyses (Scan method), to categorize neighborhoods according to their priority for intervention to prevent and control traffic accidents with victims. Georeferenced secondary data recorded by the Mobile Emergency Care Service in João Pessoa, Paraíba, in 2010 and 2011 were used. Over the study period, João Pessoa had 10,070 traffic accidents with victims. Of this total, 17.8% presented alcohol on the breath and 0.8% died at the scene. The majority of victims were male (74.5%) and belonged to the 20-29 age group (37.7%). The accidents occurred mainly on Sundays (19.2%), Saturdays (18.7%) and Fridays (14.4%), as well as in the months of December (10%), October (9.8%) and May (8.9%). Most of the vehicles involved were motorcycles (68.1%) and cars (36.5%). As for the nature of the accident, collision was the most frequent (46.2%), followed by motorcycle falls (30.7%) and pedestrians being run over (11.1%). In the analysis of the relative risk and spatial distribution of these events, neighborhoods with high relative risk that formed significant spatial clusters were concentrated in the north, northwest and northeast of the municipality. We identified 15 space-time clusters, concentrated mainly in the northern and northeastern regions and the coastal strip of the municipality. It was observed that the neighborhoods reported by the Mobile Emergency Care Service were categorized as priority by the model, Valentina and Mandacaru were categorized as having a tendency to priority, and Mangabeira was categorized as non-priority. 
The proposed decision model showed good agreement with the Mobile Emergency Care Service, satisfactorily identifying and classifying neighborhoods as priority, with tendency to priority, with tendency to non-priority, and non-priority. The results may be of relevance both to the Mobile Emergency Care Service and to other public bodies linked to road traffic, traffic education and care for victims of road traffic accidents in João Pessoa.
In Brazil, traffic accidents are a serious economic and, above all, social problem, significant for the magnitude of the mortality and the number of people left with resulting sequelae, and have thus become a serious public health problem. This work aimed to build a decision-support model based on fuzzy logic, supported by spatial and spatio-temporal analyses (Scan method), to categorize neighborhoods according to their degree of priority for interventions to prevent and combat traffic accidents with victims. Georeferenced secondary data recorded by the Serviço de Atendimento Móvel de Urgência (SAMU) in the city of João Pessoa, Paraíba, in 2010 and 2011 were used. Over the study period, João Pessoa had 10,070 traffic accident occurrences with victims. Of this total, 17.8% presented alcohol on the breath and 0.8% died at the scene of the accident. Most victims were male (74.5%) and in the 20 to 29 age group (37.7%). The accidents occurred mainly on Sundays (19.2%), Saturdays (18.7%) and Fridays (14.4%), and in the months of December (10%), October (9.8%) and May (8.9%). Most of the vehicles involved were motorcycles (68.1%) and cars (36.5%). As to the nature of the accident, collision was the most frequent (46.2%), followed by motorcycle falls (30.7%) and pedestrians being run over (11.1%). The analysis of the relative risk and spatial distribution of these events showed that the neighborhoods with high relative risk that formed significant spatial clusters were concentrated in the north, northwest and northeast of the municipality. Fifteen space-time clusters were identified, concentrated mainly in the north, the northeast and the coastal strip of the municipality.
It was observed that the neighborhoods reported by SAMU/JP were categorized by the model as priority; Mandacaru and Valentina were categorized as having a tendency to priority, and Mangabeira as non-priority. The proposed decision model showed good agreement when compared with SAMU/JP and was therefore satisfactory in identifying and classifying neighborhoods as priority, with tendency to priority, with tendency to non-priority and non-priority. The results of this research may be relevant both to SAMU/JP and to other public management bodies connected with traffic, traffic education and care for traffic victims in the municipality of João Pessoa-PB.


22

Silva, Ana Claudia Guedes. "Identificação de regiões hidrologicamente homogêneas por agrupamento fuzzy c-means no estado do Paraná." Universidade Estadual do Oeste do Paraná, 2018. http://tede.unioeste.br/handle/tede/3760.


Abstract:

Submitted by Neusa Fagundes (neusa.fagundes@unioeste.br) on 2018-06-15T17:07:21Z No. of bitstreams: 2 Ana Claudia_Silva2018.pdf: 1741410 bytes, checksum: 83384ab7c02835c3d776f862defc84c1 (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5)
Made available in DSpace on 2018-06-15T17:07:21Z (GMT). No. of bitstreams: 2 Ana Claudia_Silva2018.pdf: 1741410 bytes, checksum: 83384ab7c02835c3d776f862defc84c1 (MD5) license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Previous issue date: 2018-02-07
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
The delineation of hydrologically homogeneous regions (RHH) is an essential procedure for providing information indispensable to the modeling, planning, and management of water resources, especially when flows must be regionalized to estimate water availability in sections without measurements. The definition of strategies for the management and conservation of natural resources depends on information obtained through the identification of RHH, which is also one of the steps of a flow regionalization study. Thus, this work aims to identify the RHH in the state of Paraná using the Fuzzy C-Means clustering method. A total of 9 variables were used for the 114 fluviometric stations: 4 dependent variables related to the characteristic flows (long-term annual mean flow (Qmld), minimum annual flow with seven-day duration and 10-year return period (Q7,10), and flows associated with the 95% (Q95) and 90% (Q90) permanences) and 5 independent variables related to the morphometric characteristics of the station (drainage area (AD - m²), sum of drainage (SD - m), drainage density (DD - 1/m), and geographic location (latitude - Lat and longitude - Long)). Principal component analysis (PCA) identified the variables Qmld, DD, Lat and Long as the least representative, and they were discarded from the study; the cluster analysis proceeded using only the variables AD, SD, Q90, Q95, and Q7,10. Fuzzy C-Means was applied to the chosen variables, and the smallest objective function was found for 4 clusters with a fuzzification index (m) of 1.7.
When the fluviometric stations were separated into clusters by their degrees of membership, the largest number of stations fell in Cluster 3 (83 stations), followed by Cluster 4 (13 stations) and Clusters 1 and 2 (7 stations each); only 4 stations were not placed in any cluster and were classified as nebulous. The groups were determined almost entirely by the distribution of the AD and SD variables. The smallest coverage areas, the smallest analyzed flows and the smallest amounts of drainage in the stations' coverage areas were found in Cluster 3, whose stations are well spread across the state of Paraná. Clusters 1 and 4 were intermediate among the clusters in all parameters evaluated. The Fuzzy C-Means algorithm proved efficient for grouping the fluviometric stations in the state of Paraná: it was possible to find the characteristics of each cluster formed, with no overlap of data in the analyzed variables.
The delineation of hydrologically homogeneous regions (RHH) is an essential procedure for supplying information indispensable to the modeling, planning and management of water resources, especially when flow regionalization is needed to estimate water availability in sections that lack measurements. The definition of strategies for managing and conserving natural resources depends on information obtained by identifying RHH, which is also one of the steps of a flow regionalization study. This work therefore aims to identify the RHH in the state of Paraná using the Fuzzy C-Means clustering method. Nine variables were used, individualized for the 114 fluviometric stations adopted: 4 dependent variables referring to the characteristic flows (long-term annual mean flow (Qmld), minimum annual flow with seven-day duration and 10-year return period (Q7,10), and flows associated with the 95% (Q95) and 90% (Q90) permanences) and 5 independent variables referring to the morphometric characteristics of the station (drainage area (AD - m²), sum of drainage (SD - m), drainage density (DD - 1/m) and geographic location (latitude - Lat and longitude - Long)). Principal component analysis (PCA) identified the variables Qmld, DD, Lat and Long as the least representative; they were excluded from the study, and the cluster analysis proceeded with only the variables AD, SD, Q90, Q95 and Q7,10. Fuzzy C-Means was applied to the chosen variables, and the smallest objective function was found for 4 clusters at a fuzzification index (m) of 1.7.
Separating the fluviometric stations into clusters by their degrees of membership, we obtained the largest number of stations in Cluster 3 (83 stations), followed by Cluster 4 (13 stations) and Clusters 1 and 2 (7 stations in each cluster); only 4 stations were not placed in any cluster and were classified as nebulous. The groups were determined almost entirely by the distribution of the variables AD and SD. The smallest coverage areas, the smallest analyzed flows and the smallest amounts of drainage in the stations' coverage area were found in Cluster 3, whose stations are well spread across the state of Paraná. Clusters 1 and 4 were intermediate among the clusters in all parameters evaluated. The Fuzzy C-Means algorithm proved efficient for grouping the fluviometric stations in the state of Paraná: it was possible to find the characteristics of each cluster formed, with no overlap of data in the ranges of the analyzed variables.
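
As an illustration of the clustering step described in this abstract, the sketch below runs a plain fuzzy c-means loop with fuzzification index m = 1.7, then assigns each point to the cluster with the highest membership degree and leaves points whose best membership falls below a cut-off unassigned (the "nebulous" case). It is a minimal sketch, not the author's implementation: the function names, the toy data, the fixed initial centers and the 0.6 cut-off are assumptions made for illustration, and the abstract does not state how the unassigned stations were determined.

```python
import math


def fuzzy_c_means(points, centers, m=1.7, max_iter=100, tol=1e-6):
    """Plain fuzzy c-means on a list of equal-length tuples.

    `centers` is the initial guess. Returns (centers, memberships), where
    memberships[k][i] is the degree to which point k belongs to cluster i.
    """
    exponent = 2.0 / (m - 1.0)
    dims = len(points[0])
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for x in points:
            dists = [math.dist(x, v) for v in centers]
            if min(dists) == 0.0:
                # The point sits exactly on a center: full membership there.
                hits = [d == 0.0 for d in dists]
                row = [1.0 / sum(hits) if h else 0.0 for h in hits]
            else:
                row = [1.0 / sum((di / dj) ** exponent for dj in dists)
                       for di in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(dims)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def assign_clusters(memberships, cutoff=0.6):
    """Cluster index per point, or None if no membership reaches `cutoff`.

    The cut-off rule is an assumption for illustration only.
    """
    labels = []
    for row in memberships:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(best if row[best] >= cutoff else None)
    return labels


# Toy data (hypothetical): two tight groups and one point halfway between.
stations = [(0, 0), (0, 2), (2, 0), (2, 2),
            (10, 10), (10, 12), (12, 10), (12, 12),
            (6, 6)]
centers, u = fuzzy_c_means(stations, [(1.0, 1.0), (11.0, 11.0)])
labels = assign_clusters(u)
```

Because each row of memberships sums to 1, a point that is equally close to two centers receives about 0.5 for each and falls below the cut-off, which is one simple way to model stations that belong to no cluster.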


23

Ronzhina, Marina. "Klasifikace mikrospánku analýzou EEG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-217965.


Abstract:

This master's thesis deals with the detection of microsleep based on changes in the power spectrum of the EEG signal. The results of time-frequency analysis are the input values for the classification. The proposed classification method uses fuzzy logic. Four classifiers were designed, each based on a fuzzy inference system; they differ in their rule bases. The results of fuzzy clustering are used to design the membership functions of the rule premises. The first two microsleep-detection classifiers use only the alpha band of the EEG signal's spectrogram, which allows detection of a person's relaxation state. Unlike the first and second classifiers, the third is supplemented with rules for the delta band, which makes it possible to distinguish three states: vigilance, relaxation and somnolence. The inference system of the fourth classifier includes rules for the whole spectrum band. The method was implemented in software, and a program with a graphical user interface was created.


24

Desai, Jitamitra. "Solving Factorable Programs with Applications to Cluster Analysis, Risk Management, and Control Systems Design." Diss., Virginia Tech, 2005. http://hdl.handle.net/10919/28211.


Abstract:

Ever since the advent of the simplex algorithm, linear programming (LP) has been extensively used with great success in many diverse fields. The field of discrete optimization came to the forefront as a result of the impressive developments in the area of linear programming. Although discrete optimization problems can be viewed as belonging to the class of nonconvex programs, it has only been in recent times that optimization research has confronted the more formidable class of continuous nonconvex optimization problems, where the objective function and constraints are often highly nonlinear and nonconvex functions, defined in terms of continuous (and bounded) decision variables. Typical classes of such problems involve polynomial, or more general factorable functions.

This dissertation focuses on employing the Reformulation-Linearization Technique (RLT) to enhance model formulations and to design effective solution techniques for solving several practical instances of continuous nonconvex optimization problems, namely, the hard and fuzzy clustering problems, risk management problems, and problems arising in control systems.

Under the umbrella of the broad RLT framework, the contributions of this dissertation focus on developing models and algorithms along with related theoretical and computational results pertaining to three specific application domains. In the basic construct, through appropriate surrogation schemes and variable substitution strategies, we derive strong polyhedral approximations for the polynomial functional terms in the problem, and then rely on the demonstrated (robust) ability of the RLT for determining global optimal solutions for polynomial programming problems. The convergence of the proposed branch-and-bound algorithm follows from the tailored branching strategy coupled with consistency and exhaustive properties of the enumeration tree. First, we prescribe an RLT-based framework geared towards solving the hard and fuzzy clustering problems. In the second endeavor, we examine two risk management problems, providing novel models and algorithms. Finally, in the third part, we provide a detailed discussion on studying stability margins for control systems using polynomial programming models along with specialized solution techniques.
Ph. D.


25

Kruse, Britta. "Fuzzy-Technologie versus multivariate Statistik versus univariate Statistik ein Verfahrensvergleich am Beispiel der geotechnischen Datenanalyse von Geschiebemergel." Berlin mbv, Mensch-und-Buch-Verl, 2009. http://d-nb.info/995878218/04.



26

Hong, Sui. "Experiments with K-Means, Fuzzy c-Means and Approaches to Choose K and C." Honors in the Major Thesis, University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETH/id/1224.


Abstract:

This item is only available in print in the UCF Libraries.
Bachelors
Engineering and Computer Science
Computer Engineering


27

Wong, Cheok Meng. "A distributed particle swarm optimization for fuzzy c-means algorithm based on an apache spark platform." Thesis, University of Macau, 2018. http://umaclib3.umac.mo/record=b3950604.



28

Hudson, Cody Landon. "Protein structure analysis and prediction utilizing the Fuzzy Greedy K-means Decision Forest model and Hierarchically-Clustered Hidden Markov Models method." Thesis, University of Central Arkansas, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=1549796.


Abstract:

Structural genomics is a field of study that strives to derive and analyze the structural characteristics of proteins through means of experimentation and prediction using software and other automatic processes. Alongside implications for more effective drug design, the main motivation for structural genomics concerns the elucidation of each protein’s function, given that the structure of a protein almost completely governs its function. Historically, the approach to derive the structure of a protein has been through exceedingly expensive, complex, and time consuming methods such as x-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy.

In response to the inadequacies of these methods, three families of approaches developed in a relatively new branch of computer science known as bioinformatics. The aforementioned families include threading, homology-modeling, and the de novo approach. However, even these methods fail either due to impracticalities, the inability to produce novel folds, rampant complexity, inherent limitations, etc. In their stead, this work proposes the Fuzzy Greedy K-means Decision Forest model, which utilizes sequence motifs that transcend protein family boundaries to predict local tertiary structure, such that the method is cheap, effective, and can produce semi-novel folds due to its local (rather than global) prediction mechanism. This work further extends the FGK-DF model with a new algorithm, the Hierarchically Clustered-Hidden Markov Models (HC-HMM) method to extract protein primary sequence motifs in a more accurate and adequate manner than currently exhibited by the FGK-DF model, allowing for more accurate and powerful local tertiary structure predictions. Both algorithms are critically examined, their methodology thoroughly explained and tested against a consistent data set, the results thereof discussed at length.


29

SILVA, Alexandre Márcio Melo da. "Controle energeticamente eficiente de múltiplos saltos para redes de sensores sem fio heterogêneas utilizando lógica fuzzy." Universidade Federal do Pará, 2014. http://repositorio.ufpa.br/jspui/handle/2011/9016.


Abstract:

Submitted by Hellen Luz (hellencrisluz@gmail.com) on 2017-08-01T18:31:48Z No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Dissertacao_ControleEnergeticamenteEficiente.pdf: 1174753 bytes, checksum: 5cd54c6a58e7be0ea1b96f19908312ce (MD5)
Approved for entry into archive by Irvana Coutinho (irvana@ufpa.br) on 2017-08-22T13:01:13Z (GMT) No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Dissertacao_ControleEnergeticamenteEficiente.pdf: 1174753 bytes, checksum: 5cd54c6a58e7be0ea1b96f19908312ce (MD5)
Made available in DSpace on 2017-08-22T13:01:13Z (GMT). No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) Dissertacao_ControleEnergeticamenteEficiente.pdf: 1174753 bytes, checksum: 5cd54c6a58e7be0ea1b96f19908312ce (MD5) Previous issue date: 2014-03-21
CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
This work aims to demonstrate a centralized control to elect the most suitable Cluster Heads (CHs), assuming three levels of heterogeneity and multi-hop communication between Cluster Heads. The centralized control uses the k-means algorithm, responsible for dividing the clusters, and Fuzzy Logic for electing the Cluster Head and selecting the best communication route among those elected. The results indicate that the proposal offers great advantages compared with previous election algorithms: it allows the most suitable nodes to be selected as group leaders in each round based on the Fuzzy System values, and it uses Fuzzy Logic as a decision tool for implementing multiple hops between CHs, which minimizes the energy dissipation of the selected CHs farthest from the collection point. The inclusion of three levels of heterogeneity, corresponding to normal, advanced and super sensors, contributes considerably to increasing the network's stability period. Another major contribution obtained from the results is the use of a central control at the base station (BS), which has advantages over local processing of information at each node, the process found in traditional CH election algorithms. The proposed solution showed that electing the most efficient CH, considering its location and discrepancies in energy levels, together with the inclusion of new levels of heterogeneity, makes it possible to increase the network's stability period, that is, the period in which the network is fully functional, considerably increasing the useful lifetime of heterogeneous Wireless Sensor Networks (WSNs).
This study presents a centralized control to elect appropriate Cluster Heads (CHs), assuming three levels of heterogeneity and multi-hop communication between Cluster Heads. The centralized control uses the k-means algorithm, responsible for the division of clusters, and Fuzzy Logic to elect the Cluster Head and select the best communication route between the elected CHs. The results indicate that the proposal offers great advantages: it allows the most suitable nodes to be selected as group leaders at each round based on the Fuzzy System values, and it uses Fuzzy Logic as a decision tool to implement multiple hops between CHs, which minimizes the power dissipation of the selected CHs that are more distant from the collection point. The insertion of three levels of heterogeneity, corresponding to normal, advanced and super sensors, contributes considerably to increasing the period of network stability. Another great contribution obtained from the results is the use of a central control at the base station (BS), which has advantages over local information processing in each node, a process usually found in traditional algorithms for electing CHs. The proposed solution showed that electing the more efficient CH, considering its location and energy-level discrepancies, together with the inclusion of new heterogeneity levels, makes it possible to increase the network stability period, i.e., the period in which the network is fully functional, greatly increasing the useful lifetime of heterogeneous WSNs.


30

Gu, Yuhua. "Ant clustering with consensus." [Tampa, Fla] : University of South Florida, 2009. http://purl.fcla.edu/usf/dc/et/SFE0002959.



31

Arnaldo, Heloína Alves. "Novos métodos determinísticos para gerar centros iniciais dos grupos no algoritmo fuzzy C-Means e variantes." Universidade Federal do Rio Grande do Norte, 2014. http://repositorio.ufrn.br:8080/jspui/handle/123456789/18109.


Abstract:

Made available in DSpace on 2014-12-17T15:48:11Z (GMT). No. of bitstreams: 1 HeloinaAA_DISSERT.pdf: 1661373 bytes, checksum: df9fe39185a27ded472f2f72284acdf6 (MD5) Previous issue date: 2014-02-24
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Data clustering is a technique applied in various fields such as data mining, image processing and pattern recognition. Clustering algorithms split a data set into clusters such that elements within the same cluster have a high degree of similarity, while elements belonging to different clusters have a high degree of dissimilarity. The Fuzzy C-Means algorithm (FCM) is one of the most used and discussed fuzzy clustering algorithms in the literature. The performance of FCM is strongly affected by the selection of the initial cluster centers. Therefore, the choice of a good set of initial cluster centers is very important for the performance of the algorithm. However, in FCM the initial centers are chosen randomly, making it difficult to find a good set. This work proposes three new methods to obtain the initial cluster centers deterministically in the FCM algorithm, which can also be used in variants of FCM. In this work, these initialization methods were applied to the ckMeans variant. With the proposed methods, we intend to obtain a set of initial centers that is close to the real cluster centers. These new initialization approaches aim to reduce the number of iterations these algorithms need to converge, and their processing time, without affecting the quality of the clustering, or even improving it in some cases. Accordingly, cluster validation indices were used to measure the quality of the clusters obtained by the FCM and ckMeans algorithms, modified with the proposed initialization methods, when applied to various data sets.
Data clustering is a technique applied to several areas such as data mining, image processing and pattern recognition. Clustering algorithms partition a data set into groups in such a way that elements within the same group have a high degree of similarity, while elements belonging to different groups have a high degree of dissimilarity. The Fuzzy C-Means (FCM) algorithm is one of the most used and discussed fuzzy data clustering algorithms in the literature. The performance of FCM is strongly affected by the selection of the initial group centers, so choosing a good set of initial centers is very important for the algorithm's performance. In FCM, however, the initial centers are chosen at random, which makes it difficult to find a good set. This work proposes three new methods for obtaining the initial group centers deterministically in the FCM algorithm, which can also be used in variants of FCM. In this work, these initialization methods were applied to the ckMeans variant. The proposed methods are intended to yield a set of initial centers that is close to the real group centers. These new initialization approaches aim to reduce the number of iterations these algorithms need to converge, as well as the processing time, without affecting the quality of the clustering, or even improving it in some cases. To this end, cluster validation indices were used to measure the quality of the clusterings obtained by the FCM and ckMeans algorithms, modified with the proposed initialization methods, when applied to several data sets.


32

Budayan, Cenk. "Strategic Group Analysis: Strategic Perspective, Differentiation And Performance In Construction." Phd thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609676/index.pdf.


Abstract:

The aim of strategic group analysis is to find out if clusters of firms that have a similar strategic position exist within an industry or not. In this thesis, by using a conceptual framework that reflects the strategic context, contents and process of construction companies and utilising alternative clustering methods such as traditional cluster analysis, self-organizing maps, and fuzzy C-means technique, a strategic group analysis was conducted for the Turkish construction industry. Results demonstrate that there are three strategic groups among which significant performance differences exist. Self-organising maps provide a visual representation of group composition and help identification of hybrid structures. Fuzzy C-means technique reveals the membership degrees of a firm to each strategic group. It is recommended that real strategic group structure can only be identified by using alternative cluster analysis methods. The positive effect of differentiation strategy on achieving competitive advantage is widely acknowledged in the literature and proved to be valid for the Turkish construction industry as a result of strategic group analysis. In this study, a framework is proposed to model the differentiation process in construction. The relationships between the modes and drivers of differentiation are analyzed by structural equation modeling. The results demonstrate that construction companies can either differentiate on quality or productivity. Project management related factors extensively influence productivity differentiation whereas they influence quality differentiation indirectly. Corporate management related factors only affect quality differentiation. Moreover, resources influence productivity differentiation directly whereas they have an indirect effect on quality differentiation.


33

Koprnicky, Miroslav. "Towards a Versatile System for the Visual Recognition of Surface Defects." Thesis, University of Waterloo, 2005. http://hdl.handle.net/10012/888.


Abstract:

Automated visual inspection is an emerging multi-disciplinary field with many challenges; it combines different aspects of computer vision, pattern recognition, automation, and control systems. There does not exist a large body of work dedicated to the design of generalized visual inspection systems; that is, those that might easily be made applicable to different product types. This is an important oversight, in that many improvements in design and implementation times, as well as costs, might be realized with a system that could easily be made to function in different production environments.

This thesis proposes a framework for generalizing and automating the design of the defect classification stage of an automated visual inspection system. It involves using an expandable set of features which are optimized along with the classifier operating on them in order to adapt to the application at hand. The particular implementation explored involves optimizing the feature set in disjoint sets logically grouped by feature type to keep search spaces reasonable. Operator input is kept at a minimum throughout this customization process, since it is limited only to those cases in which the existing feature library cannot adequately delineate the classes at hand, at which time new features (or pools) may have to be introduced by an engineer with experience in the domain.

Two novel methods are put forward which fit well within this framework: cluster-space and hybrid-space classifiers. They are compared in a series of tests against standard benchmark classifiers, as well as mean and majority-vote multi-classifiers, on feature sets comprised of just the logical feature subsets, as well as the entire feature sets formed by their union. The proposed classifiers and the benchmarks are optimized with both a progressive combinatorial approach and a genetic algorithm. Experimentation was performed on true colour industrial lumber defect images, as well as binary hand-written digits.

Based on the experiments conducted in this work, it was found that the sequentially optimized multi hybrid-space methods are capable of matching the performances of the benchmark classifiers on the lumber data, with the exception of the mean-rule multi-classifiers, which dominated most experiments by approximately 3% in classification accuracy. The genetic-algorithm-optimized hybrid-space multi-classifier achieved the best performance, however: an accuracy of 79.2%.

The numeral dataset results were less promising; the proposed methods could not equal benchmark performance. This is probably because the numeral feature sets were much more conducive to good class separation, with standard benchmark accuracies approaching 95% not uncommon. This indicates that the cluster-space transform inherent to the proposed methods appears to be most useful in highly dependent or confusing feature spaces, a hypothesis supported by the outstanding performance of the single hybrid-space classifier in the difficult texture feature subspace: 42.6% accuracy, a 6% increase over the best benchmark performance.

The generalized framework proposed appears promising, because classifier performance over feature sets formed by the union of independently optimized feature subsets regularly met and exceeded those classifiers operating on feature sets formed by the optimization of the feature set in its entirety. This finding corroborates earlier work with similar results [3, 9], and is an aspect of pattern recognition that should be examined further.


34

Kheriji, Sabrine. "Design of an Energy-Aware Unequal Clustering Protocol based on Fuzzy Logic for Wireless Sensor Networks." Universitätsverlag Chemnitz, 2020. https://monarch.qucosa.de/id/qucosa%3A73303.


Abstract:

Energy consumption is a major concern in Wireless Sensor Networks (WSNs), resulting in a strong demand for energy-aware communication technologies. In this context, several unequal cluster-based routing protocols have been proposed. However, few of them adopt energetic analysis models for calculating the optimal cluster radius, and several protocols cannot achieve an optimal workload balance between sensor nodes. In this scope, the aim of the dissertation is to develop a cluster-based routing protocol for improving energy efficiency in WSNs. We propose a Fuzzy-based Energy-Aware Unequal Clustering algorithm (FEAUC) with circular partitioning to balance the energy consumption between sensor nodes and solve the hotspot problem created by multi-hop communication. The developed FEAUC involves four main phases: an off-line phase, a cluster formation phase, a cooperation phase and a data collection phase. During the off-line phase, an energy analysis is performed to calculate the radius of each ring and the optimal cluster radius per ring. The cluster formation phase is based on a fuzzy logic approach for cluster head (CH) selection. The cooperation phase aims to define an intermediate node as a router between different CHs. In the data collection phase, transmitting data packets from sensor nodes to their respective CHs is defined as intra-cluster communication, and transmitting data from one CH to another until reaching the base station is defined as inter-cluster communication. The feasibility of the developed FEAUC is demonstrated by a detailed comparison with selected unequal clustering algorithms from the literature, considering different parameters, mainly the energy consumption, the battery lifetime, the time until the first node shuts down (FND), the time until half of the nodes are off-line (HND) and the time until the last node dies (LND).
Although, the developed FEAUC is intended to enhance the network lifetime by distributing the large load of CH tasks equally among the normal nodes, running the clustering process in each round is an additional burden, which can significantly drain the remaining energy. For this reason, the FEAUC based protocol has been further developed to become a fault tolerant algorithm (FEAUC-FT). It supports the fault tolerance by using backup CHs to avoid the re-clustering process in certain rounds or by building further routing paths in case of a link failure between different CHs. The validation of the developed FEAUC in real scenarios has been performed. Some sensor nodes, powered with batteries, are deployed in a circular area forming clusters. Performance evaluations are carried out by realistic scenarios and tested for a real deployment using the low-power wireless sensor node panStamp. To complete previous works, as a step of proof of concept, a smart irrigation system is designed, called Air-IoT. Furthermore, a real-time IoT-based sensor node architecture to control the quantity of water in some deployed nodes is introduced. To this end, a cloud-connected wireless network to monitor the soil moisture and temperature is well-designed. Generally, this step is essential to validate and evaluate the proposed unequal cluster-based routing algorithm in a real demonstrator. The proposed prototype guarantees both real-time monitoring and reliable and cost-effective transmission between each node and the base station.:1 Introduction 2 Theoretical background 3 State of the art of unequal cluster-based routing protocols 4 FEAUC: Fuzzy-based Energy-Aware Unequal Clustering 5 Experimental validation of the developed unequal clustering protocol 6 Real application to specific uses cases 7 Conclusions and future research directions

APA, Harvard, Vancouver, ISO, and other styles

35

MACIEL, Christiano do Carmo de Oliveira. "Estratégia de redução de consumo de energia em redes de sensores sem fio heterogêneas utilizando lógica fuzzy." Universidade Federal do Pará, 2012. http://repositorio.ufpa.br/jspui/handle/2011/3373.

Full text

Abstract:

Advances in wireless communication and microelectronics enable the development of micro-sensor devices capable of monitoring large areas. Consisting of thousands of sensor nodes working collaboratively, Wireless Sensor Networks face severe energy constraints due to the limited battery capacity of their nodes. Energy consumption can be minimized by allowing only a few special nodes, called Cluster Heads, to receive data from the nodes of their cluster and forward it to a collection point called the Base Station. The choice of the ideal Cluster Head extends the stability period of the network and maximizes its lifetime. The proposal presented in this dissertation uses fuzzy logic and the k-means algorithm, based on information centralized at the Base Station, to elect the ideal Cluster Head in heterogeneous Wireless Sensor Networks. The criteria used to select the Cluster Head are node centrality, energy level and proximity to the Base Station. The dissertation shows the disadvantages of using local information for cluster leader election and the importance of treating the energy discrepancies among the network's nodes differently. The proposal is compared with the Low Energy Adaptive Clustering Hierarchy (LEACH) and Distributed Energy-Efficient Clustering (DEEC) algorithms, using the end of the stability period and the network lifetime as metrics.

APA, Harvard, Vancouver, ISO, and other styles

36

Szabo, Alexandre. "Agrupamento nebuloso de dados baseado em enxame de partículas: seleção por métodos evolutivos e combinação via relação nebulosa do tipo-2." Universidade Presbiteriana Mackenzie, 2014. http://tede.mackenzie.br/jspui/handle/tede/1527.

Full text

Abstract:

Issue date: 2014-10-29
Fundação de Amparo a Pesquisa do Estado de São Paulo
Clustering usually treats objects as belonging to mutually exclusive clusters, which is often imprecise, because an object may belong to more than one cluster simultaneously with different membership degrees. Clustering algorithms, both crisp and fuzzy, have a number of parameters that must be tuned to obtain the best performance on a given database. Furthermore, it is known that no single algorithm is better than all others for all problem classes, and that combining the solutions found by various algorithms (or by the same algorithm with different parameters) may lead to a global solution that is better than those found by the individual algorithms, including the best one. It is within this context that the present thesis proposes a new fuzzy clustering algorithm inspired by the behavior of particle swarms and then introduces a new way of combining clustering algorithms using concepts from Type-2 fuzzy sets.

APA, Harvard, Vancouver, ISO, and other styles

37

Palomino, Lizeth Vargas. "Técnicas de inteligência artificial aplicadas ao método de monitoramento de integridade estrutural baseado na impedância eletromecânica para monitoramento de danos em estruturas aeronáuticas." Universidade Federal de Uberlândia, 2012. https://repositorio.ufu.br/handle/123456789/14726.

Full text

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico
The basic concept of impedance-based structural health monitoring is to measure the variation of the electromechanical impedance of a structure caused by the presence of damage, using patches of piezoelectric material bonded to the surface of the structure or embedded in it. The measured electrical impedance of the PZT patch is directly related to the mechanical impedance of the structure, which is why damage can be detected by monitoring the variation of the impedance signal. To quantify damage, a dedicated metric is defined that assigns a characteristic scalar value to the fault. This study first evaluates the influence of environmental conditions, such as temperature, magnetic fields and ionic environments, on the impedance measurement. The results show that magnetic fields do not influence the impedance measurement, whereas the ionic environment does; however, when the sensor is shielded, the effect of the ionic environment is significantly reduced. The influence of sensor geometry has also been studied: the shape of the PZT patch (rectangular or circular) has no influence on the impedance measurement, but the position of the sensor is important for detecting damage correctly. This work also presents a low-cost portable impedance measurement system that automatically measures and stores data from 16 PZT patches without human intervention. A fundamental aspect of this work is characterizing the damage type from the various impedance signals collected. To this end, two artificial intelligence techniques, neural networks and fuzzy cluster analysis, were tested for classifying damage in aircraft structures, with satisfactory results.
A final contribution of this work is a study of how the electromechanical impedance-based structural health monitoring technique performs in detecting damage in structures under dynamic loading. Encouraging results were obtained.
Doctor of Mechanical Engineering

APA, Harvard, Vancouver, ISO, and other styles

38

Quéré, Romain. "Quelques propositions pour la comparaison de partitions non strictes." Phd thesis, Université de La Rochelle, 2012. http://tel.archives-ouvertes.fr/tel-00950514.

Full text

Abstract:

This thesis is devoted to the problem of comparing two non-strict partitions (fuzzy/probabilistic, possibilistic) of the same set of individuals into several clusters. Its solution relies on the formal definition of agreement measures that take up the principles of the historical measures developed for comparing strict partitions, and it finds applications in fields as varied as biology, image processing and automatic classification. Depending on whether they observe the relations between individuals described by each partition or quantify the similarities between the clusters that make up these partitions, we distinguish two broad families of measures for which the very notion of agreement between partitions differs, and we propose to characterize their representatives according to a common set of formal and informal properties. From this point of view, the measures are also qualified according to the nature of the partitions compared. A study of the multiple constructions on which the measures in the literature are based completes our taxonomy. We propose three new non-strict comparison measures that build on the state of the art. The first is an extension of a strict approach, while the other two rely on so-called native approaches, one individual-oriented and the other cluster-oriented, designed specifically for comparing non-strict partitions. Our proposals are compared with those of the literature following an experimental design chosen to cover the various aspects of the problem. The results show the value of the proposals for the research topic of partition comparison. Finally, we open new perspectives by proposing the beginnings of a framework that unifies the main individual-oriented non-strict measures.

APA, Harvard, Vancouver, ISO, and other styles

39

Chang, Kung Wei, and 張恭維. "The Web Mining Framework Combining Association Rules And Fuzzy Clusters." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/65551909349006885595.

Full text

Abstract:

Master's
Yuan Ze University
Graduate Institute of Information Management
89
Most recent web mining studies have relied on statistical clustering techniques to analyze web user profile data. However, this approach can only assign each user session to a single cluster; that is, it ignores the fact that a user session may reflect several browsing preferences. To address this shortcoming, fuzzy clustering techniques were proposed instead. But those methods can only use the similarity score between sessions to calculate the similarity between pages, so if users reach the same web page by different paths, the results are wrong. This research proposes a framework that combines fuzzy clustering and association rules. The approach filters out noisy data and employs association rules, using the confidence of each rule as the association between different URLs. Finally, an improved fuzzy clustering method, which replaces the session similarity score with the confidence between pages, is adopted to identify user preferences effectively.
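
The confidence measure this abstract refers to can be illustrated with a small sketch. This is not the thesis's implementation: the function name `page_confidence`, the session format and the sample URLs are all assumptions made for illustration. It uses the standard association-rule definition, confidence(a -> b) = (sessions containing both a and b) / (sessions containing a), and ignores visit order, so two users who reach the same page by different paths contribute equally.

```python
from collections import Counter
from itertools import permutations


def page_confidence(sessions):
    """Return {(a, b): confidence(a -> b)} computed over user sessions.

    Each session is an iterable of page URLs. Order and repeats are ignored.
    """
    single = Counter()  # number of sessions containing each page
    pair = Counter()    # number of sessions containing each ordered pair
    for session in sessions:
        pages = set(session)
        single.update(pages)
        pair.update(permutations(sorted(pages), 2))
    return {(a, b): count / single[a] for (a, b), count in pair.items()}


# Hypothetical sessions, for illustration only.
sessions = [
    ["/home", "/news", "/sports"],
    ["/home", "/news"],
    ["/home", "/shop"],
    ["/news", "/sports"],
]
conf = page_confidence(sessions)
```

In this toy data, every session that visits `/sports` also visits `/news`, so the rule `/sports -> /news` has confidence 1.0, while `/home -> /news` holds in two of three sessions. A matrix of such values could stand in for a page-to-page similarity in a clustering step.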

APA, Harvard, Vancouver, ISO, and other styles

40

Yang, Cheng-Yen, and 楊政諺. "Fuzzy C-Means Hardware Architecture for Applications Having Large Number of Clusters." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/31417302659506133167.

Full text

Abstract:

Master's
National Taiwan Normal University
Graduate Institute of Computer Science and Information Engineering
98
This paper presents a novel low-cost and high-performance VLSI architecture for fuzzy c-means clustering. In the architecture, the operations at both the centroid and data levels are pipelined to attain high computational speed while consuming low hardware resources. In addition, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. Experimental results show that the proposed solution is an effective alternative for cluster analysis with low computational cost and high performance.
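
The idea of merging the membership and centroid updates into one pass can be sketched in software. The following is a minimal illustration under stated assumptions, not the thesis's VLSI architecture: the function name `fcm_fused_step`, the fuzzifier default `m=2.0` and the sample points are hypothetical. For each data point, the standard fuzzy c-means memberships are computed on the fly and immediately accumulated into the centroid numerators and denominators, so the full membership matrix is never stored.

```python
def fcm_fused_step(points, centroids, m=2.0, eps=1e-12):
    """One fuzzy c-means iteration without storing the membership matrix.

    Standard FCM formulas:
      u_i(x) = d_i^(-2/(m-1)) / sum_j d_j^(-2/(m-1))
      v_i    = sum_x u_i(x)^m * x / sum_x u_i(x)^m
    """
    c = len(centroids)
    dim = len(centroids[0])
    num = [[0.0] * dim for _ in range(c)]  # running sums of u^m * x
    den = [0.0] * c                        # running sums of u^m
    for x in points:
        # Squared Euclidean distance to each centroid (clamped to avoid 0).
        d2 = [max(sum((xi - vi) ** 2 for xi, vi in zip(x, v)), eps)
              for v in centroids]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        for i in range(c):
            w = (inv[i] / total) ** m  # membership of x in cluster i, raised to m
            den[i] += w
            for t in range(dim):
                num[i][t] += w * x[t]
    return [[num[i][t] / den[i] for t in range(dim)] for i in range(c)]


# Hypothetical data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centroids = [[1.0, 1.0], [4.0, 4.0]]
for _ in range(20):
    centroids = fcm_fused_step(points, centroids)
```

The storage needed per iteration is proportional to the number of clusters times the dimension rather than the number of clusters times the number of points, which is the kind of saving the abstract attributes to the merged update.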

APA, Harvard, Vancouver, ISO, and other styles

41

Chu, Chih-Wen, and 朱志文. "Fuzzy Modeling and Control of Air-Conditioned Rooms with Clusters Split/Merge Algorithm." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/15857516124513949174.

Full text

Abstract:

Master's
Tatung University
Graduate Institute of Mechanical Engineering
90
In an air-conditioned room, the main factors that affect human comfort are air velocity and temperature. To describe the relation among these state variables throughout the room, TS fuzzy models of the real system are built using data clustering algorithms. To increase the accuracy of the models, the number and centers of the clusters are adjusted automatically and quickly according to criteria we propose. That is, a fast, rough clustering is first performed with the K-means algorithm. Then a cluster split/merge algorithm is applied, which automatically finds suitable cluster centers for fuzzy clustering with the Fuzzy c-means algorithm. Finally, to demonstrate the feasibility of the cluster split/merge algorithm, the resulting fuzzy model of the air-conditioned room is applied in various control approaches.

APA, Harvard, Vancouver, ISO, and other styles

42

Huang, Shang-Ming, and 黃上銘. "A survey on fuzzy clustering methods for datasets containing clusters with different shapes." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/03057198836442089975.

Full text

Abstract:

Master's
National Chung Hsing University
Department of Computer Science and Engineering
104
Data clustering techniques are used in many fields, such as pattern recognition, image segmentation, statistical data analysis, data mining and big data. Data clustering is the process of dividing data or objects into different classes or groups according to certain criteria. In this thesis, we survey fuzzy clustering methods designed specifically for partitioning clusters with different geometric shapes and investigate their performance on datasets comprising clusters of various geometric shapes. Because two-dimensional shapes are much easier for humans to recognize and describe, all experimental datasets, including those taken from the literature, are based on two-dimensional features in Cartesian coordinates. Experimental results show that, by modifying the update procedure of the fuzzy partition matrix or the distance definition in the objective function of the fuzzy c-means clustering method, the fuzzy c-means variants can partition datasets into clusters with different geometric shapes quite accurately.

APA, Harvard, Vancouver, ISO, and other styles

43

Yang, Cheng-Hao, and 楊程皓. "An GA-based Fuzzy Clustering Algorithm with Interpretable Rules and Best-Fit Clusters." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/s69556.

Full text

Abstract:

Master's
National Taiwan University of Science and Technology
Department of Electronic Engineering
105
With the increasing popularity of the Internet of Things, big data analysis has become an important topic. Using multi-sensor devices, we can easily gather real-life data and mine important information from them. These datasets are mostly high-dimensional, and most of them are unlabeled. Therefore, reducing high-dimensional data by using feature selection to choose important feature sets has become an important topic in machine learning, especially in unsupervised learning. There are many kinds of clustering algorithms, such as k-means, hierarchical clustering and mean shift clustering. Although these can give comparatively good results, we are still interested in two questions: "Which features contribute to the clustering result?" and "What is the correct number of clusters?" In this paper, we propose a clustering algorithm that finds not only the significant features but also a proper number of clusters, together with clustering rules that humans can easily interpret. Experimental results show that the proposed algorithm performs well on the real-world wine dataset.

APA, Harvard, Vancouver, ISO, and other styles

44

(14030507), Deepani B. Guruge. "Effective document clustering system for search engines." Thesis, 2008. https://figshare.com/articles/thesis/Effective_document_clustering_system_for_search_engines/21433218.

Full text

Abstract:

People use web search engines to fill a wide variety of navigational, informational and transactional needs. However, current major search engines on the web retrieve a large number of documents of which only a small fraction are relevant to the user query. The user then has to manually search for relevant documents by traversing a topic hierarchy, into which a collection is categorised. As more information becomes available, it becomes a time consuming task to search for required relevant information.

This research develops an effective tool, the web document clustering (WDC) system, to cluster and then rank the output obtained from queries submitted to a search engine into three pre-defined fuzzy clusters: closely related, related and not related. Documents in the closely related and related clusters are ranked based on their context.

The WDC output has been compared against document clustering results from the Google, Vivisimo and Dogpile systems, as these were considered the best at the fourth Search Engine Awards [24]. Test data came from standard document sets, such as the TREC-8 [118] data files and the Iris database [38], or from three test text retrieval tasks: "Latex", "Genetic Algorithms" and "Evolutionary Algorithms". Our proposed system produced results as good as, or better than, those obtained by these other systems. We have shown that the proposed system can effectively and efficiently locate closely related, related and not related documents among the retrieved document set for queries submitted to a search engine.

We developed a methodology to supply the user with a list of keywords filtered from the initial search result set to further refine the search. Again we tested our clustering results against the Google, Vivisimo and Dogpile systems. In all cases we have found that our WDC performs as well as, or better than these systems.

The contributions of this research are:

A post-retrieval fuzzy document clustering algorithm that groups documents into closely related, related and not related clusters. This algorithm uses a modified fuzzy c-means (FCM) algorithm to cluster documents into predefined intelligent fuzzy clusters, an approach that has not been used before.
The fuzzy WDC system satisfies the user's information need as far as possible by allowing the user to reformulate the initial query. The system prepares an initial word list by selecting a few characteristic terms of high frequency from the first twenty documents in the initial search engine output. The user is then able to use these terms to input a secondary query. The WDC system then creates a second word list, or the context of the user query (COQ), from the closely related documents to provide training data to refine the search. Documents containing words with high frequency from the training list, based on a pre-defined threshold value, are then presented to the user to refine the search by reformulating the query. In this way the context of the user query is built, enabling the user to learn from the keyword list. This approach is not available in current search engine technology.
A number of modifications were made to the FCM algorithm to improve its performance in web document clustering. A factor sw_kq is introduced into the membership function as a measure of the amount of overlap between the components of the feature vector and the cluster prototype. Because the FCM algorithm is greatly affected by the values used to initialise the components of the cluster prototypes, a machine learning approach using an Evolutionary Algorithm was used to resolve the initialisation problem.
Experimental results indicate that the WDC system outperformed the Google, Dogpile and Vivisimo search engines. The post-retrieval fuzzy web document clustering algorithm designed in this research improves the precision of web searches and also contributes to the knowledge of document retrieval using fuzzy logic.
A relational data model was used to automatically store the data output from the search engine off-line. This moves data processing off the Internet, saving resources and making better use of the local CPU.
The algorithm uses Latent Semantic Indexing (LSI) to rank documents in the closely related and related clusters. Using LSI to rank documents is well known; however, we are the first to apply it to ranking closely related documents by using the COQ to form the term-by-document matrix in LSI, to obtain better ranking results.
Adjustments based on document size are proposed to deal with problems caused by varying document size in the retrieved documents and the effect this has on cluster analysis.

APA, Harvard, Vancouver, ISO, and other styles

45

Tai, Chia-Hung, and 戴嘉宏. "Fuzzy Cluster-Based Query Expansion." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/41760976903310141825.

Full text

Abstract:

Master's
National Sun Yat-sen University
Department of Information Management
92
Advances in information and network technologies have fostered the creation and availability of a vast amount of online information, typically in the form of text documents. Information retrieval (IR) pertains to determining the relevance between a user query and documents in the target collection, then returning those documents that are likely to satisfy the user’s information needs. One challenging issue in IR is word mismatch, which occurs when concepts can be described by different words in the user queries and/or documents. Query expansion is a promising approach for dealing with word mismatch in IR. In this thesis, we develop a fuzzy cluster-based query expansion technique to solve the word mismatch problem. Using existing expansion techniques (i.e., global analysis and non-fuzzy cluster-based query expansion) as performance benchmarks, our empirical results suggest that the fuzzy cluster-based query expansion technique can provide a more accurate query result than the benchmark techniques can.

46

Tian, Yi-Cheng, and 田益誠. "Cluster-Weighted Fuzzy Clustering Algorithms." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/66377500446890557129.

Abstract:

Doctoral
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
103
Fuzzy clustering is generally extended from hard clustering based on fuzzy membership partitions. In fuzzy clustering, the fuzzy c-means (FCM) algorithm is the best-known clustering method, and various generalizations of FCM have been proposed to date. However, the FCM algorithm and its generalizations are always affected by initializations. In this paper, we consider a cluster-weighted term with an updating equation to adjust the effects of initializations on fuzzy clustering algorithms. We first propose the so-called cluster-weighted fuzzy clustering of the generalized FCM (GFCM). We then construct the cluster-weighted FCM, cluster-weighted Gustafson and Kessel (GK), and cluster-weighted inter-cluster separation (ICS) algorithms. Some numerical examples are used to compare our cluster-weighted fuzzy clustering with the original fuzzy clustering algorithms. We also apply the cluster-weighted fuzzy clustering algorithms to real data sets. The results demonstrate the superiority and usefulness of our proposed cluster-weighted fuzzy clustering methods.
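For orientation, here is a minimal sketch of the standard FCM iteration that this abstract takes as its starting point. It is not the cluster-weighted variant proposed in the thesis. The function name `fcm`, the fuzzifier default `m=2.0`, and the random initialization controlled by `seed` are assumptions; the `seed` argument makes explicit where the dependence on initialization enters.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative sketch).

    X : (n, d) data matrix, c : number of clusters, m : fuzzifier (> 1).
    Returns (centers, U) where U is the (n, c) membership matrix.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distances from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
centers, U = fcm(X, c=2)
labels = U.argmax(axis=1)
```

Running the sketch with different `seed` values on less well-separated data is a simple way to observe the sensitivity to initialization that motivates the cluster-weighted term.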

47

Chen, Jian-Liang, and 陳建良. "Study of Application of Fuzzy Cluster and Fuzzy Discriminant Analysis." Thesis, 1992. http://ndltd.ncl.edu.tw/handle/78498594847952053296.

48

譚嘉慧. "On Cluster Validity for Fuzzy Clustering." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/78925159856103876482.

Abstract:

Master's
Chung Yuan Christian University
Department of Mathematics
88
Before analyzing a data set, we can partition it into any specified number of clusters with the fuzzy c-means (FCM) algorithm, so that data points assigned to the same cluster are more similar to each other than to data points belonging to different clusters. The FCM algorithm requires the number of clusters c to be specified in advance. However, c is usually unknown, so estimating c becomes an important problem, known as cluster validity. A cluster validity index is regarded as reliable when the optimal number of clusters it indicates equals the true number of clusters in the data set. Many cluster validity indexes, such as the partition coefficient (PC) and partition entropy (PE), have been proposed. In this paper, we propose a new cluster validity index, called the WB index, which combines membership degrees with geometrical properties of the data. We then compare the numerical results of this new index with those of a number of known cluster validity indexes. The results indicate that the proposed WB index gives better results than the other cluster validity indexes. Keywords: fuzzy clustering, fuzzy c-means (FCM) algorithm, cluster validity, validity index
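The two established indexes named in this abstract have simple closed forms, sketched below under the usual definitions: PC is the mean of the squared memberships and PE is the mean membership entropy. The function names and the clipping constant `eps` are assumptions for illustration; the WB index itself is not defined in the abstract and is not reproduced here.

```python
import numpy as np

def partition_coefficient(U):
    """PC = (1/n) * sum_i sum_k u_ik^2 for an (n, c) membership matrix U.

    Ranges from 1/c (maximally fuzzy) to 1 (crisp partition).
    """
    U = np.asarray(U, dtype=float)
    return float(np.sum(U ** 2) / U.shape[0])

def partition_entropy(U, eps=1e-12):
    """PE = -(1/n) * sum_i sum_k u_ik * log(u_ik).

    Ranges from 0 (crisp partition) to log(c) (maximally fuzzy).
    Memberships are clipped at eps so that 0 * log(0) evaluates to 0.
    """
    U = np.asarray(U, dtype=float)
    return float(-np.sum(U * np.log(np.clip(U, eps, 1.0))) / U.shape[0])

crisp = np.array([[1.0, 0.0], [0.0, 1.0]])
fuzzy = np.array([[0.5, 0.5], [0.5, 0.5]])
pc_crisp, pe_crisp = partition_coefficient(crisp), partition_entropy(crisp)
pc_fuzzy, pe_fuzzy = partition_coefficient(fuzzy), partition_entropy(fuzzy)
```

In the usual procedure, FCM is run for a range of candidate values of c, and the value that maximizes PC (or minimizes PE) is taken as the estimated number of clusters.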

49

Pan, Jinn Anne, and 潘進安. "On Cluster Fuzzy for Directional Data." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/87400694610057115593.

Abstract:

Master's
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
83
Since von Mises (1918) introduced a distribution on directional data and the great statistician R.A. Fisher (1953) presented the important result "Dispersion on a sphere", statistical methods for directional data have been widely studied and applied in a variety of substantive areas. Mardia's book (1972) "Statistics of Directional Data" and N.I. Fisher's book (1993) "Statistical Analysis of Circular Data" give good surveys and also describe applications. Mixtures of distributions are used as models in a wide variety of important practical situations and also have applications to clustering. Mixtures of von Mises distributions are important models for directional data. Spurr & Koutbeiy (1991) gave a good comparison of various methods for estimating the parameters in mixtures of von Mises distributions. In this project, we plan to apply the fuzzy classification maximum likelihood procedures proposed by Yang (1993) to derive fuzzy clustering algorithms for directional data. Based on the newly derived algorithms, we deal with parameter estimation for mixtures of von Mises distributions. Preliminary studies have given good results. Therefore, we plan to study the following subjects in this project: 1. Derive fuzzy clustering algorithms for directional data. 2. Construct methods for parameter estimation of mixtures of von Mises distributions and compare them with other estimation methods. 3. Investigate fuzzy cluster-wise regression analysis on directional data.

50

Ko, Cheng Hsiu, and 柯政秀. "On Cluster-Wise Fuzzy Regression Analysis." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/42995305129031859293.

Abstract:

Master's
Chung Yuan Christian University
Graduate Institute of Applied Mathematics
82
Since Tanaka et al. proposed linear regression analysis with a fuzzy model, fuzzy regression analysis has been widely studied and applied in a variety of substantive areas. Heterogeneous observations are commonly encountered in regression analysis in practice. In this paper, the main goal is to apply fuzzy clustering techniques to fuzzy regression analysis. Fuzzy clustering is used to overcome the heterogeneity problem in the fuzzy regression model. We combine the two and call the result cluster-wise fuzzy regression analysis.
