Dissertations / Theses on the topic 'HYBRID CLUSTERING'

Consult the top 50 dissertations / theses for your research on the topic 'HYBRID CLUSTERING.'

1

Keller, Jens. "Clustering biological data using a hybrid approach : Composition of clusterings from different features." Thesis, University of Skövde, School of Humanities and Informatics, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-1078.

Full text
Abstract:

Clustering of data is a well-researched topic in computer science, and many approaches have been designed for different tasks. In biology, many of these approaches are hierarchical and the result is usually represented as dendrograms, e.g. phylogenetic trees. However, many non-hierarchical clustering algorithms are also well established in biology, and the approach in this thesis is based on such common algorithms. The algorithm implemented as part of this thesis uses a non-hierarchical graph clustering algorithm to compute a hierarchical clustering in a top-down fashion. It performs the graph clustering iteratively, with a previously computed cluster as the input set. The innovation is that it focuses on a different feature of the data in each step and clusters the data according to that feature. Common hierarchical approaches in biology cluster, for example, a set of genes according to the similarity of their sequences; the clustering then reflects a partitioning of the genes by sequence similarity. The approach introduced in this thesis instead uses many features of the same objects. These features can be diverse, in biology for instance similarities of the sequences, of gene expression, or of motif occurrences in the promoter region. As part of this thesis not only the algorithm itself was implemented and evaluated, but a complete software package providing a graphical user interface. The software was implemented as a framework providing the basic functionality, with the algorithm as a plug-in extending the framework. The software is meant to be extended in the future, integrating a set of algorithms and analysis tools related to the process of clustering and analysing data, not necessarily restricted to biology.

The thesis deals with topics in biology, data mining and software engineering and is divided into six chapters. The first chapter gives an introduction to the task and the biological background; it provides an overview of common clustering approaches and explains the differences between them. Chapter two presents the idea behind the new clustering approach and points out differences and similarities between it and common clustering approaches. The third chapter discusses the aspects concerning the software, including the algorithm; it illustrates the architecture and analyses the clustering algorithm. After the implementation the software was evaluated, which is described in the fourth chapter, pointing out observations made while using the new algorithm. Furthermore, this chapter discusses differences and similarities to related clustering algorithms and software. The thesis ends with the last two chapters, namely conclusions and suggestions for future work. Readers interested in repeating the experiments made as part of this thesis can contact the author via e-mail to obtain the relevant evaluation data, scripts or source code.
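
A minimal sketch of the top-down idea described in this abstract, using scikit-learn's KMeans only as a stand-in for the non-hierarchical graph clustering step; the feature names, array shapes, and the `feature_order` parameter are illustrative assumptions rather than details from the thesis:

```python
# Sketch: top-down hierarchical clustering where each level partitions the
# current cluster according to a *different* feature of the same objects.
# KMeans stands in for the non-hierarchical (graph) clustering algorithm.
import numpy as np
from sklearn.cluster import KMeans

def hybrid_top_down(indices, features, feature_order, k=2, min_size=4):
    """Recursively split `indices` using one feature matrix per level.

    indices       -- object indices belonging to the current cluster
    features      -- dict: feature name -> (n_objects, dim) array
    feature_order -- list of feature names, one per recursion level
    Returns a nested dict representing the clustering tree.
    """
    if not feature_order or len(indices) < min_size:
        return {"objects": list(indices)}            # leaf cluster
    name, rest = feature_order[0], feature_order[1:]
    X = features[name][indices]                      # restrict to this cluster
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    children = []
    for c in range(k):
        sub = indices[labels == c]
        children.append(hybrid_top_down(sub, features, rest, k, min_size))
    return {"feature": name, "children": children}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 60
    features = {                                     # illustrative feature sets
        "sequence_similarity": rng.normal(size=(n, 5)),
        "expression_profile":  rng.normal(size=(n, 8)),
        "promoter_motifs":     rng.normal(size=(n, 3)),
    }
    tree = hybrid_top_down(np.arange(n), features,
                           ["sequence_similarity", "expression_profile",
                            "promoter_motifs"])
    print(tree["feature"], len(tree["children"]))
```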

2

Tyree, Eric William. "A hybrid methodology for data clustering." Thesis, City University London, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301057.

Full text
3

Moore, Garrett Lee. "A Hybrid (Active-Passive) VANET Clustering Technique." Diss., NSUWorks, 2019. https://nsuworks.nova.edu/gscis_etd/1077.

Full text
Abstract:
Clustering serves a vital role in the operation of Vehicular Ad hoc Networks (VANETs) by continually grouping highly mobile vehicles into logical hierarchical structures. These moving clusters support Intelligent Transport Systems (ITS) applications and message routing by establishing a more stable global topology. Clustering increases the scalability of the VANET by eliminating broadcast storms caused by packet flooding and facilitates multi-channel operation. In the research literature, clustering techniques are partitioned into two categories: active and passive. Active techniques rely on periodic beacon messages from all vehicles containing location, velocity, and direction information. However, in areas of high vehicle density, congestion may occur on the long-range channel used for beacon messages, limiting the scale of the VANET. Passive techniques use information embedded in the packet headers of existing traffic to perform clustering. In this method, vehicles that are not transmitting traffic may leave cluster heads with stale information and malformed clusters. This dissertation presents a hybrid active/passive clustering technique, where the passive technique is used as a congestion control strategy for areas where congestion is detected in the network. In this case, cluster members halt their periodic beacon messages and use position information embedded in the header to update the cluster head with their position. This work demonstrated through simulation that the hybrid technique reduced or eliminated the delays caused by congestion in the modified Distributed Coordination Function (DCF) process, thus increasing the scalability of VANETs in urban environments. Packet loss and delays caused by the hidden terminal problem were limited to distant, non-clustered vehicles. This dissertation report presents a literature review, methodology, results, analysis, and conclusion.
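
The switching logic at the heart of such a hybrid scheme can be illustrated with a short sketch; the congestion threshold, message fields, and class names below are illustrative assumptions, not details taken from the dissertation:

```python
# Sketch: a cluster member switches between active beaconing and passive
# (piggybacked) position updates when channel congestion is detected.
# Threshold values and message formats are illustrative only.
from dataclasses import dataclass

CONGESTION_THRESHOLD = 0.6   # assumed channel-busy ratio that triggers passive mode

@dataclass
class PositionUpdate:
    vehicle_id: int
    x: float
    y: float
    speed: float
    piggybacked: bool        # True if embedded in an existing data packet header

class ClusterMember:
    def __init__(self, vehicle_id):
        self.vehicle_id = vehicle_id
        self.passive_mode = False

    def on_channel_report(self, busy_ratio):
        """Cluster head (or local sensing) reports channel load; switch modes."""
        self.passive_mode = busy_ratio >= CONGESTION_THRESHOLD

    def next_update(self, x, y, speed, has_data_traffic):
        """Return the position update to send in this interval, if any."""
        if not self.passive_mode:
            # Active technique: periodic beacon on the long-range channel.
            return PositionUpdate(self.vehicle_id, x, y, speed, piggybacked=False)
        if has_data_traffic:
            # Passive technique: embed position in an outgoing packet header.
            return PositionUpdate(self.vehicle_id, x, y, speed, piggybacked=True)
        return None              # nothing to send; cluster head keeps last known state

if __name__ == "__main__":
    v = ClusterMember(vehicle_id=7)
    v.on_channel_report(busy_ratio=0.75)          # congestion detected
    print(v.next_update(10.0, 4.5, 13.9, has_data_traffic=True))
```
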
4

Gurcan, Fatih. "A Hybrid Movie Recommender Using Dynamic Fuzzy Clustering." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/2/12611667/index.pdf.

Full text
Abstract:
Recommender systems are information retrieval tools that help users in their information seeking tasks and guide them in a large space of possible options. Many hybrid recommender systems have been proposed so far to overcome the shortcomings of pure content-based (PCB) and pure collaborative filtering (PCF) systems. Most studies on recommender systems aim to improve the accuracy and efficiency of predictions. In this thesis, we propose an online hybrid recommender strategy (CBCFdfc) based on a content-boosted collaborative filtering algorithm which aims to improve prediction accuracy and efficiency. CBCFdfc combines content-based and collaborative characteristics to solve problems like sparsity, new items and over-specialization. CBCFdfc uses fuzzy clustering to keep a certain level of prediction accuracy while decreasing online prediction time. We compare CBCFdfc with PCB and PCF according to prediction accuracy metrics, and with CBCFonl (online CBCF without clustering) according to online recommendation time. Test results showed that CBCFdfc performs better than the other approaches in most cases. We also evaluate the effect of user-specified parameters on prediction accuracy and efficiency, and determine optimal values for these parameters based on the test results. In addition to experiments performed on simulated data, we also conduct a user study and evaluate users' opinions about the recommended movies. The results obtained in the user evaluation are satisfactory. As a result, the proposed system can be regarded as an accurate and efficient hybrid online movie recommender.
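
Since the abstract centres on fuzzy clustering of the rating data to keep online prediction fast, a compact fuzzy c-means sketch over a user-rating matrix may help; the toy matrix, number of clusters, and fuzzifier value are illustrative assumptions, not the thesis's configuration:

```python
# Sketch: fuzzy c-means over a (users x items) rating matrix, as might be used
# to pre-group users offline so that online predictions only consult a few
# cluster prototypes.  Parameters below are illustrative.
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=100, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per user
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        new_U = 1.0 / (dist ** (2 / (m - 1)))
        new_U /= new_U.sum(axis=1, keepdims=True)
        if np.abs(new_U - U).max() < tol:
            U = new_U
            break
        U = new_U
    return centers, U

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ratings = rng.integers(1, 6, size=(200, 40)).astype(float)  # toy user-item ratings
    centers, memberships = fuzzy_c_means(ratings, c=4)
    # Online step (sketch): predict from the prototypes weighted by membership.
    user0_pred = memberships[0] @ centers
    print(user0_pred[:5])
```
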
5

Tantrum, Jeremy. "Model based and hybrid clustering of large datasets." Thesis, Connect to this title online; UW restricted, 2003. http://hdl.handle.net/1773/8933.

Full text
6

Garbiso, Julian Pedro. "Fair auto-adaptive clustering for hybrid vehicular networks." Thesis, Paris, ENST, 2017. http://www.theses.fr/2017ENST0061/document.

Full text
Abstract:
Dans le cadre du développement des innovations dans les Systèmes de Transport Intelligents, les véhicules connectés devront être capables de télécharger des informations basées sur la position sur et depuis des serveurs distants. Ces véhicules seront équipés avec des différentes technologies d’accès radio, telles que les réseaux cellulaires ou les réseaux véhicule-à-véhicule (V2V) comme IEEE 802.11p. Les réseaux cellulaires, avec une couverture presque omniprésente, fournissent un accès à internet avec garanties de qualité de service. Cependant, l’accès à ces réseaux est payant. Dans cette thèse, un algorithme de clustering multi-saut est proposé avec pour objectif de réduire le coût d’accès au réseau cellulaire en agrégeant des données sur le réseau V2V. Pour faire ceci, le leader du cluster (CH, de l’anglais Cluster Head) est utilisé comme passerelle unique vers le réseau cellulaire. Pour le test d’une application d’exemple pour télécharger du Floating Car Data agrégé, les résultats des simulations montrent que cette approche réduit l’utilisation du réseau cellulaire de plus de 80%, en s’attaquant à la redondance typique des données basées sur la position dans les réseaux véhiculaires. Il y a une contribution en trois parties : Premièrement, une approche pour déléguer la sélection du CH à la station de base du réseau cellulaire afin de maximiser la taille des clusters, et par conséquent le taux de compression. Deuxièmement, un algorithme auto-adaptatif qui change dynamiquement le nombre maximum de sauts afin de maintenir un équilibre entre la réduction des coûts d’accès au réseau cellulaire et le taux de perte de paquets dans le réseau V2V. Finalement, l’incorporation d’une théorie de la justice distributive, afin d’améliorer l’équité sur la durée concernant la distribution des coûts auxquels les CH doivent faire face, améliorant ainsi l’acceptabilité sociale de la proposition. Les algorithmes proposés ont été testés via simulation, et les résultats montrent une réduction significative dans l’utilisation du réseau cellulaire, une adaptation réussie du nombre de sauts aux changements de la densité du trafic véhiculaire, et une amélioration dans les métriques d’équité, sans affecter la performance des réseaux
For the development of innovative Intelligent Transportation Systems applications, connected vehicles will frequently need to upload and download position-based information to and from servers. These vehicles will be equipped with different Radio Access Technologies (RAT), such as cellular and vehicle-to-vehicle (V2V) technologies like LTE and IEEE 802.11p respectively. Cellular networks can provide internet access almost anywhere, with QoS guarantees. However, accessing these networks has an economic cost. In this thesis, a multi-hop clustering algorithm is proposed with the aim of reducing cellular access costs by aggregating information and off-loading data onto the V2V network, using the Cluster Head as a single gateway to the cellular network. For the example application of uploading aggregated Floating Car Data, simulation results show that this approach reduces cellular data consumption by more than 80% by reducing the typical redundancy of position-based data in a vehicular network. There is a threefold contribution: first, an approach that delegates the Cluster Head selection to the cellular base station in order to maximize the cluster size, thus maximizing aggregation; secondly, a self-adaptation algorithm that dynamically changes the maximum number of hops, addressing the trade-off between cellular access reduction and V2V packet loss; finally, the incorporation of a theory of distributive justice, for improving fairness over time regarding the distribution of the costs which Cluster Heads have to incur, thus improving the proposal's social acceptability. The proposed algorithms were tested via simulation, and the results show a significant reduction in cellular network usage, a successful adaptation of the number of hops to changes in the vehicular traffic density, and an improvement in fairness metrics, without affecting network performance.
7

Javed, Ali. "A Hybrid Approach to Semantic Hashtag Clustering in Social Media." ScholarWorks @ UVM, 2016. http://scholarworks.uvm.edu/graddis/623.

Full text
Abstract:
The uncontrolled usage of hashtags in social media makes them vary widely in semantic quality and frequency of usage. Such variations pose a challenge to current approaches, which capitalize on either the lexical semantics of a hashtag by using metadata or the contextual semantics of a hashtag by using the texts associated with it. This thesis presents a hybrid approach to clustering hashtags based on their semantics, designed in two phases. The first phase is a sense-level, metadata-based semantic clustering algorithm that has the ability to differentiate among distinct senses of a hashtag as opposed to the hashtag word itself. The gold standard test demonstrates that sense-level clusters are significantly more accurate than word-level clusters. The second phase is a hybrid semantic clustering algorithm using a consensus clustering approach, which finds the consensus between metadata-based sense-level semantic clusters and text-based semantic clusters. The gold standard test shows that the hybrid algorithm outperforms both the text-based algorithm and the metadata-based algorithm for a majority of the ground truths tested and that it never underperforms both baseline algorithms. In addition, a larger-scale performance study, conducted with a focus on disagreements in cluster assignments between algorithms, shows that the hybrid algorithm makes the correct cluster assignment in a majority of disagreement cases.
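
One common way to take the consensus of two clusterings, as the hybrid phase above does, is through a co-association matrix followed by hierarchical clustering; the sketch below assumes the two input label vectors are already available and uses average linkage as an illustrative choice, not the thesis's actual consensus algorithm:

```python
# Sketch: consensus of two hashtag clusterings via a co-association matrix.
# Two hashtags are linked strongly if both the metadata-based and the
# text-based clusterings put them together.  Linkage choice is illustrative.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def consensus_clusters(labels_a, labels_b, n_clusters):
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    co = ((labels_a[:, None] == labels_a[None, :]).astype(float) +
          (labels_b[:, None] == labels_b[None, :]).astype(float)) / 2.0
    dist = 1.0 - co                              # co-association -> distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

if __name__ == "__main__":
    metadata_labels = [0, 0, 1, 1, 2, 2, 2]      # e.g. sense-level clusters
    text_labels     = [0, 0, 1, 2, 2, 2, 1]      # e.g. text-based clusters
    print(consensus_clusters(metadata_labels, text_labels, n_clusters=3))
```
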
8

GARRAFFA, MICHELE. "Exact and Heuristic Hybrid Approaches for Scheduling and Clustering Problems." Doctoral thesis, Politecnico di Torino, 2016. http://hdl.handle.net/11583/2639115.

Full text
Abstract:
This thesis deals with the design of exact and heuristic algorithms for scheduling and clustering combinatorial optimization problems. All the works are linked by the fact that the presented methods are basically hybrid algorithms that mix techniques used in the world of combinatorial optimization. The algorithms are all efficient in practice, except the one presented in Chapter 4, which is mostly of theoretical interest. Chapter 2 presents practical solution algorithms based on an ILP model for an energy scheduling combinatorial problem that arises in a smart building context. Chapter 3 presents a new cutting stock problem and introduces a mathematical formulation and a heuristic solution approach based on a heuristic column generation scheme. Chapter 4 provides an exact exponential algorithm, whose importance is so far only theoretical, for a classical scheduling problem: the Single Machine Total Tardiness Problem. The relevant aspect is that the designed algorithm has the best worst-case complexity for the problem, which has been studied for several decades. Furthermore, this result is based on a new technique, called Branch and Merge, that avoids the solution of several equivalent sub-problems in a branching algorithm requiring polynomial space. As a consequence, this technique embeds in a branching algorithm ideas coming from other traditional computer science techniques such as dynamic programming and memoization, while keeping the space requirement polynomial. Chapter 5 provides an exact approach based on semidefinite programming and a matheuristic approach based on a quadratic solver for a fractional clustering combinatorial optimization problem, called the Max-Mean Dispersion Problem. The matheuristic approach has the peculiarity of using a non-linear MIP solver. The proposed exact approach uses a general semidefinite programming relaxation and is likely to be extendable to other combinatorial problems with a fractional formulation. Chapter 6 proposes practical solution methods for a real-world clustering problem arising in a smart city context. The solution algorithm is based on the solution of a Set Cover model via a commercial ILP solver. In conclusion, the main contribution of this thesis is a set of approaches of practical or theoretical interest for two classes of important combinatorial problems: clustering and scheduling. All the practical methods presented in the thesis are validated by extensive computational experiments that compare the proposed methods with those available in the state of the art.
9

Masoudi, Pedram. "Application of hybrid uncertainty-clustering approach in pre-processing well-logs." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S023/document.

Full text
Abstract:
La thèse est principalement centrée sur l'étude de la résolution verticale des diagraphies. On outre, l'arithmétique floue est appliquée aux modèles expérimentaux pétrophysiques en vue de transmettre l'incertitude des données d'entrée aux données de sortie, ici la saturation irréductible en eau et la perméabilité. Les diagraphies sont des signaux digitaux dont les données sont des mesures volumétriques. Le mécanisme d'enregistrement de ces données est modélisé par des fonctions d'appartenance floues. On a montré que la Résolution Verticale de la Fonction d'Appartenance (VRmf) est supérieur d'espacement. Dans l'étape suivante, la fréquence de Nyquist est revue en fonction du mécanisme volumétrique de diagraphie ; de ce fait, la fréquence volumétrique de Nyquist est proposée afin d'analyser la précision des diagraphies. Basé sur le modèle de résolution verticale développée, un simulateur géométrique est conçu pour générer les registres synthétiques d'une seule couche mince. Le simulateur nous permet d'analyser la sensibilité des diagraphies en présence d'une couche mince. Les relations de régression entre les registres idéaux (données d'entrée de ce simulateur) et les registres synthétiques (données de sortie de ce simulateur) sont utilisées comme relations de déconvolution en vue d'enlever l'effet des épaules de couche d'une couche mince sur les diagraphies GR, RHOB et NPHI. Les relations de déconvolution ont bien été appliquées aux diagraphies pour caractériser les couches minces. Par exemple, pour caractériser une couche mince poreuse, on a eu recours aux données de carottage qui étaient disponibles pour la vérification : NPHI mesuré (3.8%) a été remplacé (corrigé) par 11.7%. NPHI corrigé semble être plus précis que NPHI mesuré, car la diagraphie a une valeur plus grande que la porosité de carottage (8.4%). Il convient de rappeler que la porosité totale (NPHI) ne doit pas être inférieure à la porosité effective (carottage). En plus, l'épaisseur de la couche mince a été estimée à 13±7.5 cm, compatible avec l'épaisseur de la couche mince dans la boite de carottage (<25 cm). Normalement, l'épaisseur in situ est inférieure à l'épaisseur de la boite de carottage, parce que les carottes obtenues ne sont plus soumises à la pression lithostatique, et s'érodent à la surface du sol. La DST est appliquée aux diagraphies, et l'intervalle d'incertitude de DST est construit. Tandis que la VRmf des diagraphies GR, RHOB, NPHI et DT est ~60 cm, la VRmf de l'intervalle d'incertitude est ~15 cm. Or, on a perdu l'incertitude de la valeur de diagraphie, alors que la VRmf est devenue plus précise. Les diagraphies ont été ensuite corrigées entre l'intervalle d'incertitude de DST avec quatre simulateurs. Les hautes fréquences sont amplifiées dans les diagraphies corrigées, et l'effet des épaules de couche est réduit. La méthode proposée est vérifiée dans les cas synthétiques, la boite de carottage et la porosité de carotte. L'analyse de partitionnement est appliquée aux diagraphies NPHI, RHOB et DT en vue de trouver l'intervalle d'incertitude, basé sur les grappes. Puis, le NPHI est calibré par la porosité de carottes dans chaque grappe. Le √MSE de NPHI calibré est plus bas par rapport aux cinq modèles conventionnels d'estimation de la porosité (au minimum 33% d'amélioration du √MSE). Le √MSE de généralisation de la méthode proposée entre les puits voisins est augmenté de 42%. L'intervalle d'incertitude de la porosité est exprimé par les nombres flous. 
L'arithmétique floue est ensuite appliquée dans le but de calculer les nombres flous de la saturation irréductible en eau et de la perméabilité. Le nombre flou de la saturation irréductible en eau apporte de meilleurs résultats en termes de moindre sous-estimation par rapport à l'estimation nette. Il est constaté que lorsque les intervalles de grappes de porosité ne sont pas compatibles avec la porosité de carotte, les nombres flous de la perméabilité ne sont pas valables
In subsurface geology, characterization of geological beds by well-logs is an uncertain task. The thesis mainly concerns studying the vertical resolution of well-logs (question 1). In the second stage, fuzzy arithmetic is applied to experimental petrophysical relations to project the uncertainty range of the inputs onto the outputs, here irreducible water saturation and permeability (question 2). Regarding the first question, the logging mechanism is modelled by fuzzy membership functions. The vertical resolution of the membership function (VRmf) is larger than the spacing and the sampling rate. Due to the volumetric mechanism of logging, a volumetric Nyquist frequency is proposed. Developing a geometric simulator for generating synthetic logs of a single thin-bed enabled us to analyse the sensitivity of the well-logs to the presence of a thin-bed. Regression-based relations between ideal logs (simulator inputs) and synthetic logs (simulator outputs) are used as deconvolution relations for removing the shoulder-bed effect of thin-beds from GR, RHOB and NPHI well-logs. The NPHI deconvolution relation is applied to a real case where the core porosity of a thin-bed is 8.4%. The NPHI well-log is 3.8%, and the deconvolved NPHI is 11.7%. Since it is not reasonable for the core porosity (effective porosity) to be higher than the NPHI (total porosity), the deconvolved NPHI is more accurate than the NPHI well-log, revealing that the shoulder-bed effect is reduced in this case. The thickness of the same thin-bed was also estimated to be 13±7.5 cm, which is compatible with the thickness of the thin-bed in the core box (<25 cm). Usually, in situ thickness is less than the thickness in the core boxes, since at the earth's surface there is no overburden pressure and the cores are weathered. Dempster-Shafer Theory (DST) was used to create a well-log uncertainty range. While the VRmf of the well-logs is more than 60 cm, the VRmf of the belief and plausibility functions (the boundaries of the uncertainty range) would be about 15 cm. So, the VRmf is improved, while the certainty of the well-log value is lost. In comparison with the geometric method, the DST-based algorithm resulted in a smaller uncertainty range for the GR, RHOB and NPHI logs by 100%, 71% and 66%, respectively. In the next step, cluster analysis is applied to NPHI, RHOB and DT for the purpose of providing a cluster-based uncertainty range. Then, NPHI is calibrated by the core porosity value in each cluster, showing a low √MSE compared to the five conventional porosity estimation models (at least 33% improvement in √MSE). Then, fuzzy arithmetic is applied to calculate fuzzy numbers of irreducible water saturation and permeability. The fuzzy number of irreducible water saturation provides better (less overestimation) results than the crisp estimation. It is found that when the cluster interval of porosity is not compatible with the core porosity, the permeability fuzzy numbers are not valid, e.g. in well #4. Finally, in the possibilistic approach (fuzzy theory), by calibrating the α-cut, the right uncertainty interval could be achieved with respect to the scale of the study.
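
The fuzzy-arithmetic step, propagating an uncertain input through an experimental petrophysical relation via α-cuts, can be sketched as below; the triangular fuzzy number, the power-law relation, and its coefficients are generic illustrative assumptions, not the relations used in the thesis:

```python
# Sketch: alpha-cut propagation of a triangular fuzzy porosity through a
# generic monotone power-law permeability relation k = a * phi**b.
# The relation and its coefficients are placeholders, not the thesis models.
import numpy as np

def alpha_cuts(tri, alphas):
    """Interval [lo, hi] of a triangular fuzzy number (l, mode, r) at each alpha."""
    l, m, r = tri
    return [(l + a * (m - l), r - a * (r - m)) for a in alphas]

def propagate(tri_phi, func, alphas):
    """Propagate each alpha-cut interval through a monotone function `func`."""
    out = []
    for lo, hi in alpha_cuts(tri_phi, alphas):
        vals = (func(lo), func(hi))
        out.append((min(vals), max(vals)))
    return out

if __name__ == "__main__":
    permeability = lambda phi: 500.0 * phi ** 3          # illustrative relation
    phi_fuzzy = (0.06, 0.09, 0.13)                       # fuzzy porosity (fraction)
    alphas = np.linspace(0.0, 1.0, 5)
    for a, (lo, hi) in zip(alphas, propagate(phi_fuzzy, permeability, alphas)):
        print(f"alpha={a:.2f}  k in [{lo:.4f}, {hi:.4f}]")
```
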
10

Hung, Chih-Li. "An adaptive SOM model for document clustering using hybrid neural techniques." Thesis, University of Sunderland, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.400460.

Full text
11

GRIBEL, DANIEL LEMES. "HYBRID GENETIC ALGORITHM FOR THE MINIMUM SUM-OF-SQUARES CLUSTERING PROBLEM." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2017. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=30724@1.

Full text
Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Clusterização desempenha um papel importante em data mining, sendo útil em muitas áreas que lidam com a análise exploratória de dados, tais como recuperação de informações, extração de documentos e segmentação de imagens. Embora sejam essenciais em aplicações de data mining, a maioria dos algoritmos de clusterização são métodos ad-hoc. Eles carecem de garantias na qualidade da solução, que em muitos casos está relacionada a uma convergência prematura para um mínimo local no espaço de busca. Neste trabalho, abordamos o problema de clusterização a partir da perspectiva de otimização, onde propomos um algoritmo genético híbrido para resolver o problema Minimum Sum-of-Squares Clustering (MSSC, em inglês). A meta-heurística proposta é capaz de escapar de mínimos locais e gerar soluções quase ótimas para o problema MSSC. Os resultados mostram que o método proposto superou os resultados atuais da literatura – em termos de qualidade da solução – para quase todos os conjuntos de instâncias considerados para o problema MSSC.
Clustering plays an important role in data mining, being useful in many fields that deal with exploratory data analysis, such as information retrieval, document extraction, and image segmentation. Although they are essential in data mining applications, most clustering algorithms are ad hoc methods. They lack guarantees on solution quality, which in many cases is related to a premature convergence to a local minimum of the search space. In this research, we address the problem of data clustering from an optimization perspective, proposing a hybrid genetic algorithm to solve the Minimum Sum-of-Squares Clustering (MSSC) problem. This meta-heuristic is capable of escaping from local minima and generating near-optimal solutions to the MSSC problem. Results show that the proposed method outperformed the best current literature results, in terms of solution quality, for almost all considered sets of benchmark instances for the MSSC objective.
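
A toy version of a hybrid genetic algorithm for the MSSC objective may make the idea concrete: each individual is a set of centroids, offspring recombine parent centroids, and a short run of Lloyd's algorithm (via scikit-learn's KMeans) acts as local improvement. Population size, crossover, mutation, and all other parameters are illustrative and far simpler than the method in the thesis:

```python
# Sketch: hybrid GA for minimum sum-of-squares clustering (MSSC).
# Individuals = k centroids; local improvement = a few Lloyd iterations.
import numpy as np
from sklearn.cluster import KMeans

def local_improve(X, centers, iters=5):
    """A few Lloyd iterations starting from the given centroids."""
    km = KMeans(n_clusters=len(centers), init=centers, n_init=1, max_iter=iters)
    km.fit(X)
    return km.cluster_centers_, km.inertia_        # improved centers, MSSC cost

def hybrid_ga_mssc(X, k=3, pop_size=10, generations=30, seed=0):
    rng = np.random.default_rng(seed)
    population = []
    for _ in range(pop_size):
        init = X[rng.choice(len(X), size=k, replace=False)]
        population.append(local_improve(X, init))
    for _ in range(generations):
        i, j = rng.choice(pop_size, size=2, replace=False)
        parents = np.vstack([population[i][0], population[j][0]])
        child = parents[rng.choice(2 * k, size=k, replace=False)]   # crossover
        if rng.random() < 0.3:                                      # mutation
            child = child + rng.normal(scale=0.1, size=child.shape)
        child = local_improve(X, child)
        worst = max(range(pop_size), key=lambda idx: population[idx][1])
        if child[1] < population[worst][1]:
            population[worst] = child                               # replacement
    return min(population, key=lambda ind: ind[1])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in ((0, 0), (3, 3), (0, 4))])
    centers, cost = hybrid_ga_mssc(X, k=3)
    print("best MSSC cost:", round(cost, 2))
```
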
12

Goncalves, Contente Francisco. "Hierarchical Clustering based Dynamic Subarrays for Hybrid Beamforming Massive MU-MIMO." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285515.

Full text
Abstract:
Hybrid beamforming in multiple-input multiple-output (MIMO) systems has surged in the past few years as the prime precoding architecture for massive MIMO systems, since it addresses the trade-off between system performance and complexity/power consumption. Most of the literature addresses HBF MIMO with a fixed antenna subarray; however, it is possible to further increase the spectral efficiency of the system by dynamically configuring the HBF MIMO subarray according to the channel state information of the users. In this study it is shown that finding the subarray configuration which maximizes the largest singular values of the channel covariance matrix is a good near-optimal approach to finding the mapping which provides the best system performance. With this observation, two dynamic mapping algorithms, based on hierarchical clustering analysis, are presented. Since both algorithms provide marginally identical performance, the least computationally expensive one was chosen for an extensive study of the conditions under which the dynamic subarray should be used and of how the channel characteristics and the system parameters can affect the potential gain. To implement this new dynamic subarray architecture, it is necessary to add a switch network to the HBF MIMO. Therefore, to reduce the complexity of the network, and consequently its insertion losses, some constrained variants of the dynamic mapping algorithm are proposed. Simulation results show that in some particularly challenging scenarios the dynamic subarray can provide up to a 100% performance increase compared with the fixed architecture in the low-SNR region, and more than a 50% sum-rate increase at high SNR. The results also indicate that in the long term, for various scenarios, the dynamic subarray is expected to show significant performance gains.
Hybridbeamforming i flerantennsystem (MIMO-system) har under de senaste åren seglat upp som den främsta förkodningsarkitekturen för massiva MIMO-system- Detta eftersom den hanterar avvägningen mellan systemprestanda, komplexitet och kraftförbrukning. En stor del av litteraturen behandlar HBF MIMO med fixerad antennallokering, men det är möjligt att ytterligare öka spektraleffektiviteten hos systemet genom att dynamiskt konfigurera HBF MIMO-subarrayerna baserat på kanalstatusinformationen för användarna. I denna studie visas att subarray-konfigurationen, som maximerar de största singulärvärdena för en kanalkovariansmatrisen, är ett bra tillvägagångssätt i jämförelse med andra alternativ för att hitta den allokering som ger bäst systemprestanda. Med upptäckten i studien, presenteras två dynamiska allokeringsalgoritmer, baserade på hierarkisk klusteranalys. Eftersom båda algoritmerna ger marginellt identisk prestanda, valdes algoritmen med lägst beräknings för att genomföra en utförlig studie för att förstå under vilka förhållanden den dynamiska subarray-konfigurationen bör användas, samt hur kanalkaraktäristiken och systemparametrar kan påverka den potentiella vinsten.För att implementera denna nya dynamiska subarray-arkitektur är det nödvändigt att lägga till ett switchnätverk till HBF MIMO. För att minska nätverkets komplexitet och därmed minska inkopplingsförlusterna, föreslås några begränsade varianter av den dynamiska allokeringsalgoritmen.Simuleringsresultat visar att den dynamiska subarray kan ge upp till 100% förhöjd prestanda i vissa särskilt utmanande scenarier, jämfört med den fasta arkitekturen, i det låga SNR-området och för det höga SNR-området en prestandaökning med mer än 50%. Resultaten indikerar att på lång sikt, för olika scenarier, förväntas dynamiska subarrayer ge en betydande prestandaökning.
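
A rough sketch of the clustering step described above: antennas whose rows of the channel covariance matrix behave similarly are grouped into the same subarray by hierarchical clustering. The channel model, the correlation-based distance, the linkage choice, and the number of RF chains are illustrative assumptions, not the thesis's algorithms:

```python
# Sketch: group antennas into dynamic subarrays by hierarchically clustering
# the rows of the channel covariance matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def dynamic_subarrays(H, n_rf_chains):
    """H: (n_users x n_antennas) channel matrix.  Returns antenna -> subarray map."""
    R = H.conj().T @ H                      # n_antennas x n_antennas covariance
    features = np.hstack([R.real, R.imag])  # each antenna's covariance row as a feature vector
    dist = pdist(features, metric="correlation")
    Z = linkage(dist, method="average")
    return fcluster(Z, t=n_rf_chains, criterion="maxclust")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_users, n_antennas = 4, 16
    H = (rng.normal(size=(n_users, n_antennas)) +
         1j * rng.normal(size=(n_users, n_antennas))) / np.sqrt(2)
    mapping = dynamic_subarrays(H, n_rf_chains=4)
    print(mapping)                          # subarray index per antenna
```
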
13

Chen, Alvin Yun-Wen. "Peer clustering: a hybrid architecture for massively scaled distributed virtual environments." Diss., Restricted to subscribing institutions, 2007. http://proquest.umi.com/pqdweb?did=1472132491&sid=1&Fmt=2&clientId=1564&RQT=309&VName=PQD.

Full text
14

Kannamareddy, Aruna Sai. "Density and partition based clustering on massive threshold bounded data sets." Kansas State University, 2017. http://hdl.handle.net/2097/35467.

Full text
Abstract:
Master of Science
Department of Computing and Information Sciences
William H. Hsu
The project explores the possibility of increasing the efficiency of clusters formed from massive data sets using a threshold blocking algorithm. Clusters formed in this way are denser and of higher quality. Clusters formed by individual clustering algorithms alone do not necessarily eliminate outliers, and the clusters generated can be complex or improperly distributed over the data set. The threshold blocking algorithm, from a recent paper by Michael Higgins of the Statistics Department, on the other hand performs better than existing algorithms at forming dense and distinctive units with a predefined threshold. Developing a hybridized algorithm that applies existing clustering algorithms to re-cluster the units thus formed is part of this project. Clustering on the seeds formed by the threshold blocking algorithm eases the task for the existing algorithm by eliminating the overhead of handling outliers. The clusters thus generated are also more representative of the whole. Moreover, since the threshold blocking algorithm is proven to be fast and efficient, many more decisions can be predicted from large data sets in less time. Predicting similar songs from the Million Song Dataset using such a hybridized algorithm is the task considered for evaluating this goal.
15

Baburam, Arun. "Adaptive mobility based clustering and hybrid geographic routing for mobile ad hoc networks." Thesis, University of Sussex, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.436822.

Full text
16

Tour, Samir R. "Parallel Hybrid Clustering using Genetic Programming and Multi-Objective Fitness with Density (PYRAMID)." NSUWorks, 2006. http://nsuworks.nova.edu/gscis_etd/886.

Full text
Abstract:
Clustering is the art of locating patterns in large data sets. It is an active research area that provides value to scientific as well as business applications. There are some challenges that face practical clustering, including: identifying clusters of arbitrary shapes, sensitivity to the order of input, dynamic determination of the number of clusters, outlier handling, high dependency on user-defined parameters, processing speed of massive data sets, and the potential to fall into sub-optimal solutions. Many studies conducted in the realm of clustering have addressed some of these challenges. This study proposes a new approach, called parallel hybrid clustering using genetic programming and multi-objective fitness with density (PYRAMID), that tackles several of these challenges from a different perspective. PYRAMID employs genetic programming to represent arbitrary cluster shapes and circumvent falling into local optima. It accommodates large data sets and avoids dependency on the order of input by quantizing the data space, i.e., the space on which the data set resides, thus abstracting it into hyper-rectangular cells and creating genetic programming individuals as concatenations of these cells. Thus the cells become the subject of clustering, rather than the data points themselves. PYRAMID also utilizes a density-based multi-objective fitness function to handle outliers. It gathers statistics in a pre-processing step and uses them so as not to rely on user-defined parameters. Finally, PYRAMID employs data parallelism in a master-slave model in an attempt to mitigate the inherently slow performance of evolutionary algorithms and provide speedup. A master processor distributes the clustering data evenly onto multiple slave processors. The slave processors conduct the clustering on their local data sets and report their clustering results back to the master, which consolidates them by merging the partial results into a final clustering solution. This last step also involves determining the number of clusters dynamically and labeling them accordingly. Experiments have demonstrated that, using these features, PYRAMID offers an advantage over some of the existing approaches by tackling the clustering challenges from a different angle.
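
The space-quantization idea, abstracting the data into hyper-rectangular cells and treating dense cells rather than points as the units to be grouped, can be sketched as follows; the grid resolution, density threshold, and the simple flood-fill merging used here in place of the genetic-programming search are illustrative choices only:

```python
# Sketch: quantize a 2-D data space into cells, keep cells whose point count
# exceeds a density threshold, and merge adjacent dense cells into clusters.
# Flood-fill merging stands in for PYRAMID's genetic-programming search.
import numpy as np

def grid_density_clusters(X, bins=20, min_points=5):
    counts, xedges, yedges = np.histogram2d(X[:, 0], X[:, 1], bins=bins)
    dense = counts >= min_points
    labels = np.full(dense.shape, -1, dtype=int)
    current = 0
    for i in range(bins):
        for j in range(bins):
            if dense[i, j] and labels[i, j] == -1:
                stack = [(i, j)]                       # flood fill over 4-neighbours
                while stack:
                    a, b = stack.pop()
                    if 0 <= a < bins and 0 <= b < bins and dense[a, b] and labels[a, b] == -1:
                        labels[a, b] = current
                        stack += [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]
                current += 1
    # Assign each point the label of its cell (-1 = outlier / sparse region).
    xi = np.clip(np.digitize(X[:, 0], xedges) - 1, 0, bins - 1)
    yi = np.clip(np.digitize(X[:, 1], yedges) - 1, 0, bins - 1)
    return labels[xi, yi], current

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal((0, 0), 0.4, (300, 2)),
                   rng.normal((4, 4), 0.4, (300, 2)),
                   rng.uniform(-2, 6, (30, 2))])        # two blobs plus noise
    point_labels, n_clusters = grid_density_clusters(X)
    print("clusters found:", n_clusters)
```
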
17

Dayananda, Karanam Ravichandran. "Zone Based Hybrid Approach for Clustering and Data Collection in Wireless Sensor Networks." Thesis, North Dakota State University, 2018. https://hdl.handle.net/10365/28738.

Full text
Abstract:
A wireless sensor network (WSN) is a collection of spatially distributed autonomous sensor nodes that can be used to monitor, among other things, environmental conditions. WSN nodes are constrained by their limited energy supply, communication range and local computational capabilities. Data routing is an area that can be optimized to allow nodes to conserve energy, improving the network's overall lifetime. Though many routing protocols can be used, a clustering protocol can play an important role in conserving WSN energy. A new hybrid algorithm is proposed which incorporates both distributed and centralized algorithms for selecting the cluster head (CH). In most networks, sensor nodes have limited energy, so a mobile data collector (MDC) is used to collect information, reducing energy requirements. The performance of the proposed algorithm is evaluated using NS-2 simulations. The results show that the proposed algorithm has better performance, throughput and network lifetime compared to existing routing protocols.
18

Ali, Klaib Alhadi. "Clustering-based labelling scheme : a hybrid approach for efficient querying and updating XML documents." Thesis, University of Huddersfield, 2018. http://eprints.hud.ac.uk/id/eprint/34580/.

Full text
Abstract:
Extensible Markup Language (XML) has become a dominant technology for transferring data across the worldwide web. XML labelling schemes play a key role in handling XML data efficiently and robustly, and many labelling schemes have been proposed. However, these labelling schemes have limitations and shortcomings. The aim of this research was therefore to investigate the existing XML labelling schemes and their limitations in order to address the issue of XML query performance. This thesis investigated the existing labelling schemes and classified them into three categories based on certain criteria, in order to identify the limitations and challenges of these labelling schemes. Based on the outcomes of this investigation, this thesis proposed a state-of-the-art labelling scheme, called the clustering-based labelling scheme, to resolve or improve the key limitations such as the efficiency of XML query processing, the labelling of XML nodes, and the cost of XML updates. This thesis argued that using certain existing labelling schemes to label nodes, together with clustering-based techniques, can improve query and node-labelling efficiency. Theoretically, the proposed scheme is based on dividing the nodes of an XML document into clusters. Two existing labelling schemes, the Dewey and LLS labelling schemes, were selected for labelling these clusters and their nodes. Subsequently, the proposed scheme was designed and implemented. In addition, the Dewey and LLS labelling schemes were implemented for the purpose of evaluating the proposed scheme. Four experiments were then designed to test the proposed scheme against the Dewey and LLS labelling schemes. The results of these experiments suggest that the proposed scheme achieved better results than the Dewey and LLS schemes. Consequently, the research hypothesis was accepted overall with few exceptions, and the proposed scheme showed an improvement in performance and in all the targeted features and aspects.
19

Alzahrani, Khalid Mohammed. "Perspectives on Hybrid Electric Vehicles in the Kingdom Of Saudi Arabia." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-dissertations/305.

Full text
Abstract:
"To satisfy the global energy demand while accommodating the rapidly increasing consumption rate in its domestic market, Saudi Arabia must develop and implement fuel efficiency programs in many sectors. Since transportation is a major contributor to fuel consumption and emission levels, introducing Hybrid Electric Vehicles (HEV) provides a viable solution to mitigate the current problems. However, existing studies on the diffusion of innovative vehicle technologies as well as on the understanding of the vehicle ownership and consumer behavior in Saudi Arabia are sparse. To fill this knowledge gap, I have aimed at developing an in-depth knowledgebase about general vehicle ownership and HEV ownership potential in particular for Saudi Arabia in my dissertation. I have achieved the research goal through a comprehensive online questionnaire that contains three different perspectives with each contributing a chapter in my dissertation. The first perspective provides a general understanding of the vehicle owners’ behaviors by analyzing over 600 questionnaire responses. It sheds light on the vehicle ownership determinants of the respondents that currently own vehicles as well as on respondents’ future vehicle purchase plans. This research perspective reveals the importance of vehicle price and seating capacity and points out that seating capacity is not necessarily defined by the household size in Saudi Arabia. As HEV is not yet available in the Saudi market, the next perspective applies the Theory of Reasoned Action (TRA) by analyzing 847 questionnaire responses to identify factors that might drive Saudis’ intention to adopt such technology. The results indicate that, while both subjective norm and attitude are significant in explaining the intention, subjective norm has three times stronger effect on adopting HEV than attitude. The last perspective contains a three-stage analysis to help identify the profiles of the most potential HEV early adopters and increase the chance for the relevant stakeholders to reach out to an effective range of consumers. Three characteristics of such adopters are identified: at least 35 years old, part of a larger household (more than 6 people), and owning more than one vehicle. "
20

Annakula, Chandravyas. "Hierarchical and partitioning based hybridized blocking model." Kansas State University, 2017. http://hdl.handle.net/2097/35468.

Full text
Abstract:
Master of Science
Department of Computing and Information Sciences
William H. Hsu
(Higgins, Savje, & Sekhon, 2016) provides a blocking algorithm that enables large and complex experiments to run in polynomial time without sacrificing the precision of estimates on a covariate dataset. The goal of this project is to run different clustering algorithms on top of the clusters formed by the above-mentioned blocking algorithm and to analyze the performance and compatibility of the clustering algorithms. We first apply the blocking algorithm to a covariate dataset, and once the clusters are formed, we then apply our clustering algorithm, HAC (Hierarchical Agglomerative Clustering) or PAM (Partitioning Around Medoids), to the seeds of the clusters. This helps us to generate more similar clusters. We compare the performance and precision of our hybridized clustering techniques with the pure clustering techniques to identify a suitable hybridized blocking model.
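
A minimal sketch of the two-stage idea: blocks from a threshold-blocking step are summarized by their seeds (here, block centroids), and hierarchical agglomerative clustering is then run on those seeds. The use of centroids as seeds and the toy random blocking below are illustrative placeholders, not the Higgins et al. procedure:

```python
# Sketch of the hybridized pipeline: (1) blocks from a threshold-blocking step
# are summarized by seed points (block centroids), (2) HAC groups the seeds.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def toy_threshold_blocks(X, block_size=4, seed=0):
    """Placeholder blocking: random blocks of `block_size` units."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    return [order[i:i + block_size] for i in range(0, len(X), block_size)]

def cluster_block_seeds(X, blocks, n_clusters=3):
    seeds = np.array([X[idx].mean(axis=0) for idx in blocks])   # block centroids
    Z = linkage(seeds, method="ward")                           # HAC on the seeds
    seed_labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    point_labels = np.empty(len(X), dtype=int)
    for block, lab in zip(blocks, seed_labels):                 # propagate to units
        point_labels[block] = lab
    return point_labels

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.5, size=(40, 2)) for c in ((0, 0), (5, 0), (2, 5))])
    blocks = toy_threshold_blocks(X)
    print(cluster_block_seeds(X, blocks, n_clusters=3)[:10])
```
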
21

Oztoprak, Kasim. "Hybrid CDN P2P Architecture for Multimedia Streaming." PhD thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609400/index.pdf.

Full text
Abstract:
In this thesis, the problems caused by peer behavior in peer-to-peer (P2P) video streaming are investigated. First, peer behaviors are modeled using two-dimensional continuous-time Markov chains to investigate the reliability of P2P video streaming systems. Then a metric is proposed to evaluate the dynamic behavior and evolution of the P2P overlay network. Next, a hybrid geographical location-time and interest-based clustering algorithm is proposed to improve the success ratio and reduce the delivery time of required content. Finally, a Hybrid Fault Tolerant Video Streaming System (HFTS) over P2P networks is designed, conforming to the required Quality of Service (QoS) and fault tolerance. The results indicate that the required QoS can be achieved in streaming video applications using the proposed hybrid approach.
22

Jaradat, Mohammad Abdel Kareem Rasheed. "A hybrid system for fault detection and sensor fusion based on fuzzy clustering and artificial immune systems." Texas A&M University, 2005. http://hdl.handle.net/1969.1/4780.

Full text
Abstract:
In this study, an efficient new hybrid approach for multiple-sensor data fusion and fault detection is presented, addressing the problem of possible multiple faults, based on conventional fuzzy soft clustering and an artificial immune system (AIS). The proposed hybrid approach consists of three main phases. In the first phase, signal separation is performed using the Fuzzy C-Means (FCM) algorithm. Subsequently, a single (fused) signal is generated by the fusion engine based on the information provided by the sensor signals. The information provided by the previous two phases is used for fault detection in the third phase, based on the Artificial Immune System (AIS) negative selection mechanism. Simulations and experiments on multiple-sensor systems have confirmed the strength of the new approach for online fusion and fault detection. The hybrid system provides fault tolerance by handling problems such as noisy sensor signals and multiple faulty sensors, which makes the new hybrid approach attractive for solving such fusion problems and for fault detection during real-time operation. The hybrid system is extended to early fault detection in complex mechanical systems based on a set of extracted features that characterize the collected sensor data. The hybrid system is able to detect the onset of fault conditions which can lead to critical damage or failure. This early detection of failure signs can provide more effective information for maintenance actions or corrective procedure decisions.
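
The two ingredients named in the abstract, soft clustering of redundant sensor signals for fusion and negative selection for fault detection, can be illustrated with a small sketch; the detector radius, the simple mean used as the fusion rule, and all numeric parameters are illustrative assumptions, not the thesis's design:

```python
# Sketch: flag faulty sensor readings with a negative-selection detector set,
# then fuse the remaining (healthy) readings.
import numpy as np

def train_detectors(self_samples, n_detectors=200, radius=0.15, seed=0):
    """Random detectors that do NOT match any normal ('self') sample."""
    rng = np.random.default_rng(seed)
    lo, hi = self_samples.min(), self_samples.max()
    detectors = []
    while len(detectors) < n_detectors:
        d = rng.uniform(lo - 2.0, hi + 2.0)
        if np.abs(self_samples - d).min() > radius:     # negative selection
            detectors.append(d)
    return np.array(detectors)

def is_faulty(reading, detectors, radius=0.15):
    """A reading matched by any detector lies outside normal behaviour."""
    return bool(np.abs(detectors - reading).min() <= radius)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    normal = rng.normal(5.0, 0.1, size=500)             # 'self' data from healthy sensors
    detectors = train_detectors(normal)
    sensors = np.array([5.02, 4.97, 6.4])               # third sensor has drifted
    votes = np.array([is_faulty(s, detectors) for s in sensors])
    fused = sensors[~votes].mean() if (~votes).any() else float("nan")
    print("faulty sensors:", np.where(votes)[0], "fused value:", round(fused, 3))
```
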
23

Tejaswi, Nunna. "Performance Analysis on Hybrid and Exact Methods for Solving Clustered VRP : A Comparative Study on VRP Algorithms." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14112.

Full text
Abstract:
Context: The Vehicle Routing Problem (VRP) is an NP-hard problem that combines a variety of topics such as logistics, optimization research and data mining. There is a vast need for vehicle routing solutions in day-to-day life, under different constraints. Accordingly, this problem has been a field of interest to many researchers, who incorporate scientific methods to combine and innovate new solutions to optimize the routing. Being NP-hard, it is almost impossible to compute solutions to optimality, but years of research in this area have paid off quite significantly and the solutions are being optimized little by little. Some applications may not regard a slight difference in performance as a considerable effect, but other applications or scenarios depend heavily on the performance of the solution, where it is vital that the solution is optimized to the fullest. As a data mining technique, clustering has been used very prominently in partitioning scenarios, and it has also begun to surface in the implementation of VRP solutions. Although it has only recently entered the vehicle routing field and shown some significant results, it has not yet reached wide awareness. This awareness of clustering matters to a large extent for recent researchers who formulate new algorithms to solve the VRP, and can help them further optimize their solutions.
Objectives: In this study the significance of clustering is considered, to find out how the usage of clustering techniques can favorably alter the performance of VRP-based solutions. To test the results of two recently proposed cluster-based algorithms, a comparison is made with other types of algorithms, showing how these algorithms stand against various methods.
Methods: A literature review is performed using articles gathered from Google Scholar, and an empirical experiment is then conducted on the results available in the papers. The experiment takes the form of a comparative analysis.
Results: For the literature review, results were gathered from all the articles based on their research, experience, use of clustering, and how their results were improved by using clustering methods in their formulations. For the experiment, the results of both algorithms were compared with the results of five other papers that aim to solve the VRP using exactly the same instances used by the two algorithms, in order to compare valid results on the same variables. The results were then analyzed for the purpose of comparison and conclusions were drawn accordingly.
Conclusions: From the research performed in this paper we can conclude the considerable significance of clustering techniques, drawn from the practical test results of various authors. From the experiment performed it is clear that the hybrid algorithm has a much higher performance than any other algorithm it has been compared to. This algorithm has also been shown to improve its performance due to the implementation of clustering techniques in its formulation. Since the results were based only on performance, in this case the total distance of the final route, future work would implement the algorithms to compare them on the basis of time complexity and space complexity as well.
24

Naldi, Murilo Coelho. "Agrupamento híbrido de dados utilizando algoritmos genéticos." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-07112006-080351/.

Full text
Abstract:
Técnicas de Agrupamento vêm obtendo bons resultados quando utilizados em diversos problemas de análise de dados, como, por exemplo, a análise de dados de expressão gênica. Porém, uma mesma técnica de agrupamento utilizada em um mesmo conjunto de dados pode resultar em diferentes formas de agrupar esses dados, devido aos possíveis agrupamentos iniciais ou à utilização de diferentes valores para seus parâmetros livres. Assim, a obtenção de um bom agrupamento pode ser visto como um processo de otimização. Esse processo procura escolher bons agrupamentos iniciais e encontrar o melhor conjunto de valores para os parâmetros livres. Por serem métodos de busca global, Algoritmos Genéticos podem ser utilizados durante esse processo de otimização. O objetivo desse projeto de pesquisa é investigar a utilização de Técnicas de Agrupamento em conjunto com Algoritmos Genéticos para aprimorar a qualidade dos grupos encontrados por algoritmos de agrupamento, principalmente o k-médias. Esta investigação será realizada utilizando como aplicação a análise de dados de expressão gênica. Essa dissertação de mestrado apresenta uma revisão bibliográfica sobre os temas abordados no projeto, a descrição da metodologia utilizada, seu desenvolvimento e uma análise dos resultados obtidos.
Clustering techniques have been obtaining good results when used in several data analysis problems, such as gene expression data analysis. However, the same clustering technique used on the same data set can result in different ways of clustering the data, due to the possible initial clusterings or the use of different values for the free parameters. Thus, obtaining a good clustering can be seen as an optimization process. This process tries to obtain a good clustering by selecting good initial clusterings and the best values for the free parameters. Being global search methods, Genetic Algorithms have been successfully used during this optimization process. The goal of this research project is to investigate the use of clustering techniques together with Genetic Algorithms to improve the quality of the clusters found by clustering algorithms, mainly k-means. This investigation was carried out using the analysis of gene expression data, a bioinformatics problem, as the application. This dissertation presents a bibliographic review of the issues covered in the project, a description of the methodology followed, its development and an analysis of the results obtained.
25

Jaafar, Amine. "Traitement de la mission et des variables environnementales et intégration au processus de conception systémique." Thesis, Toulouse, INPT, 2011. http://www.theses.fr/2011INPT0070/document.

Full text
Abstract:
Ce travail présente une démarche méthodologique visant le «traitement de profils» de «mission» et plus généralement de «variables environnementales» (mission, gisement, conditions aux limites), démarche constituant la phase amont essentielle d’un processus de conception systémique. La «classification» et la «synthèse» des profils relatifs aux variables d’environnement du système constituent en effet une première étape inévitable permettant de garantir, dans une large mesure, la qualité du dispositif conçu et ce à condition de se baser sur des «indicateurs» pertinents au sens des critères et contraintes de conception. Cette approche s’inscrit donc comme un outil d’aide à la décision dans un contexte de conception systémique. Nous mettons en particulier l’accent dans cette thèse sur l’apport de notre approche dans le contexte de la conception par optimisation qui, nécessitant un grand nombre d’itérations (évaluation de solutions de conception), exige l’utilisation de «profils compacts» au niveau informationnel (temps, fréquence,…). Nous proposons dans une première phase d’étude, une démarche de «classification» et de «segmentation» des profils basée sur des critères de partitionnement. Cette étape permet de guider le concepteur vers le choix du nombre de dispositifs à concevoir pour sectionner les produits créés dans une gamme. Dans une deuxième phase d’étude, nous proposons un processus de «synthèse de profil compact», représentatif des données relatives aux variables environnementales étudiées et dont les indicateurs de caractérisation correspondent aux caractéristiques de référence des données réelles. Ce signal de durée réduite est obtenu par la résolution d’un problème inverse à l’aide d’un algorithme évolutionnaire en agrégeant des motifs élémentaires paramétrés (sinusoïde, segments, sinus cardinaux). Ce processus de «synthèse compacte» est appliqué ensuite sur des exemples de profils de missions ferroviaires puis sur des gisements éoliens (vitesse du vent) associés à la conception de chaînes éoliennes. Nous prouvons enfin que la démarche de synthèse de profil représentatif et compact accroît notablement l’efficacité de l’optimisation en minimisant le coût de calcul facilitant dès lors une approche de conception par optimisation
This work presents a methodological approach for analyzing and processing mission profiles and, more generally, environmental variables (e.g. solar or wind energy potential, temperature, boundary conditions) in the context of system design. This process is a key step in ensuring system effectiveness with regard to design constraints and objectives. In this thesis, we pay particular attention to the use of compact profiles for environmental variables in the frame of system-level integrated optimal design, which requires a large number of system simulations. In the first part, we propose a clustering approach based on partition criteria with the aim of analyzing mission profiles. This phase can help designers identify different system configurations matching the corresponding clusters: it may guide suppliers towards a "market segmentation" fulfilling not only economic constraints but also technical design objectives. The second stage of the study proposes a synthesis process for a compact profile which represents the data of the studied environmental variable. This compact profile is generated by combining a number of parameterized elementary patterns (segments, sines or cardinal sines) with regard to design indicators. The latter are established with respect to the main objectives and constraints associated with the designed system. All pattern parameters are obtained by solving the corresponding inverse problem with evolutionary algorithms. Finally, this synthesis process is applied to two different case studies. The first consists of the simplification of wind data derived from measurements at two geographic sites in Guadeloupe and Tunisia. The second case deals with the reduction of a set of railway mission profiles for a hybrid locomotive devoted to shunting and switching missions. These examples show that our approach leads to a substantial reduction of the profiles associated with environmental variables, which allows a significant decrease in computation time in the context of an integrated optimal design process.
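
The synthesis step, finding parameters of a few elementary patterns so that a short signal reproduces the reference indicators of a long profile, can be sketched with SciPy's differential evolution standing in for the evolutionary solver; the choice of sinusoidal patterns, of mean/standard deviation/maximum as indicators, and of the compact length are illustrative assumptions:

```python
# Sketch: synthesize a short "compact profile" whose statistical indicators
# (mean, standard deviation, maximum) match those of a long reference profile,
# by fitting a sum of sinusoidal elementary patterns with an evolutionary solver.
import numpy as np
from scipy.optimize import differential_evolution

def compact_profile(params, length=200):
    """Offset plus two parameterized sinusoids evaluated on a short horizon."""
    off, a1, f1, a2, f2 = params
    t = np.linspace(0.0, 1.0, length)
    return off + a1 * np.sin(2 * np.pi * f1 * t) + a2 * np.sin(2 * np.pi * f2 * t)

def indicators(signal):
    return np.array([signal.mean(), signal.std(), signal.max()])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Long reference profile, e.g. hours of measured wind speed (synthetic here).
    reference = np.clip(8 + 2.5 * np.sin(np.linspace(0, 60, 10_000)) +
                        rng.normal(0, 1.0, 10_000), 0, None)
    target = indicators(reference)

    def cost(params):
        return np.linalg.norm(indicators(compact_profile(params)) - target)

    bounds = [(0, 20), (0, 10), (0.5, 20), (0, 10), (0.5, 20)]
    result = differential_evolution(cost, bounds, seed=1, maxiter=200, tol=1e-8)
    print("target indicators :", np.round(target, 3))
    print("compact indicators:", np.round(indicators(compact_profile(result.x)), 3))
```
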
APA, Harvard, Vancouver, ISO, and other styles
26

SOUZA, Leandro Carlos de. "Agrupamento e regressão linear de dados simbólicos intervalares baseados em novas representações." Universidade Federal de Pernambuco, 2016. https://repositorio.ufpe.br/handle/123456789/17640.

Full text
Abstract:
Um intervalo é um tipo de dado complexo usado na agregação de informações ou na representação de dados imprecisos. Este trabalho apresenta duas novas representações para intervalos com o objetivo de se construir novos métodos de agrupamento e regressão linear para este tipo de dado. O agrupamento por nuvens dinâmicas define partições nos dados e associa protótipos a cada uma destas partições. Os protótipos resumem a informação das partições e são usados na minimização de um critério que depende de uma distância, responsável por quantificar a proximidade entre instâncias e protótipos. Neste sentido, propõe-se a formulação de uma nova distância híbrida entre intervalos baseando-se em distâncias para pontos. Os pontos utilizados são obtidos dos intervalos através de um mapeamento. Também são propostas duas versões com pesos para a distância criada: uma com pesos no hibridismo e outra com pesos adaptativos. Na regressão linear, propõe-se a representação dos intervalos através da equação paramétrica da reta. Esta parametrização permite o ajuste dos pontos nas variáveis regressoras que dão as melhores estimativas para os limites da variável resposta. Antes da realização da regressão, um critério é calculado para a verificação da coerência matemática da predição, na qual o limite superior deve ser maior ou igual ao inferior. Se o critério mostra que a coerência não é garantida, propõe-se a aplicação de uma transformação sobre a variável resposta. Assim, este trabalho também propõe algumas transformações que podem ser aplicadas a dados intervalares, no contexto de regressão. Dados sintéticos e reais são utilizados para comparar os métodos provenientes das representações propostas e aqueles presentes na literatura.
An interval is a complex data type used in information aggregation or in the representation of imprecise data. This work presents two new representations of intervals in order to construct a new clustering method and a new linear regression method for this kind of data. Dynamic clustering defines partitions of the data and associates a prototype with each partition. The prototypes summarize the information of the partitions and are used in a minimization criterion that depends on a distance, which quantifies the proximity between instances and prototypes. In this sense, a new hybrid distance between intervals is proposed, based on a family of distances between points; the points are obtained from the interval through a mapping. Two weighted versions of the hybrid distance are also proposed: one with weights on the hybridism and another with adaptive weights. In linear regression, it is proposed to represent the intervals through the parametric equation of a line. This parametrization makes it possible to find the points in the regressor variables that give the best estimates for the limits of the response variable. Before the regression is built, a criterion is computed to verify the mathematical consistency of the prediction, where the upper limit must be greater than or equal to the lower one. If the test shows that consistency is not guaranteed, a transformation of the response variable is applied; thus, this work also proposes some transformations that can be applied to interval data in the regression context. Synthetic and real data are used to compare the methods derived from the proposed representations with those in the literature.
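To make the idea of a hybrid interval distance more concrete, here is a minimal sketch in Python with made-up data, prototypes and weight: each interval is mapped to two point summaries, its midpoint and its half-range, and the two point distances are mixed with a weight alpha. This only illustrates the general idea, not the thesis's actual formulation or its adaptive weighting.

```python
import numpy as np

def hybrid_interval_distance(a, b, alpha=0.5):
    """Toy hybrid distance between intervals a=(a_lo, a_hi) and b=(b_lo, b_hi).

    Each interval is mapped to two points, its midpoint and its half-range,
    and the distance mixes the two point distances with weight alpha.
    Illustrative only; not the thesis's formula.
    """
    a_lo, a_hi = a
    b_lo, b_hi = b
    mid = abs((a_lo + a_hi) / 2 - (b_lo + b_hi) / 2)   # distance between midpoints
    rng = abs((a_hi - a_lo) / 2 - (b_hi - b_lo) / 2)   # distance between half-ranges
    return alpha * mid + (1 - alpha) * rng

# Tiny dynamic-clustering-style assignment step: each interval goes to the
# closest prototype interval under the hybrid distance.
data = [(1.0, 2.0), (1.2, 2.5), (8.0, 9.5), (7.5, 9.0)]
prototypes = [(1.0, 2.2), (8.0, 9.2)]
labels = [min(range(len(prototypes)),
              key=lambda k: hybrid_interval_distance(x, prototypes[k]))
          for x in data]
print(labels)   # expected: [0, 0, 1, 1]
```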
APA, Harvard, Vancouver, ISO, and other styles
27

Barak, Sasan. "Technical and Fundamental Features’ analysis for Stock Market Prediction with Data Mining Methods." Doctoral thesis, Università degli studi di Bergamo, 2019. http://hdl.handle.net/10446/128764.

Full text
Abstract:
One of the most important concerns of market practitioners is future information about the companies whose stocks they trade. A reliable prediction of a company's financial status allows investors to invest with more confidence and gain greater profits (Huang, 2012b). Accurate prediction of stock prices also has a positive effect on an organization's financial stability (Asadi et al., 2012). Since the financial market is a complex, non-linear dynamic system, its prediction is very challenging (Huang and Tsai, 2009). The steady progress of computer hardware technology in recent decades has led to a large supply of powerful and affordable computers, data collection equipment, and storage media. This technology has greatly boosted the database and information industry and made a huge number of databases and information repositories available for transaction management, information retrieval, and data analysis. Data mining is defined as a group of algorithms and methods designed to analyze data or to extract patterns in specific categories from data, contributing greatly to business strategies, engineering, medical research, and finance (Klosgen and Zytkow, 1996). Prediction of stock prices, credit scores, and even bankruptcy potential are examples of the applicability of data mining in finance. In this research we use data mining tools to forecast stock and share prices and future trends. There are different approaches to financial forecasting in general, and to stock market price forecasting in particular, including fundamental analysis, technical analysis, and news, via econometric or machine learning algorithms (Atsalakis et al., 2011; Kar et al., 2014); this thesis goes through all of these methodologies. The thesis consists of three papers by the author, published in ISI journals, on using technical and fundamental features for stock market prediction with different data mining algorithms, presented in chapters 3 to 5. The thesis exploits different types of financial data sets and establishes three aspects of stock market forecasting via different combinations of feature engineering and machine learning models.
APA, Harvard, Vancouver, ISO, and other styles
28

Luo, Hongwei. "Modelling and simulation of large-scale complex networks." RMIT University. Mathematical and Geospatial Sciences, 2007. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080506.142224.

Full text
Abstract:
Real-world large-scale complex networks such as the Internet, social networks and biological networks have increasingly attracted the interest of researchers from many areas. Accurate modelling of the statistical regularities of these large-scale networks is critical to understand their global evolving structures and local dynamical patterns. Traditionally, the Erdos and Renyi random graph model has helped the investigation of various homogeneous networks. During the past decade, a special computational methodology has emerged to study complex networks, the outcome of which is identified by two models: the Watts and Strogatz small-world model and the Barabasi-Albert scale-free model. At the core of the complex network modelling process is the extraction of characteristics of real-world networks. I have developed computer simulation algorithms for study of the properties of current theoretical models as well as for the measurement of two real-world complex networks, which lead to the isolation of three complex network modelling essentials. The main contribution of the thesis is the introduction and study of a new General Two-Stage growth model (GTS Model), which aims to describe and analyze many common-featured real-world complex networks. The tools we use to create the model and later perform many measurements on it consist of computer simulations, numerical analysis and mathematical derivations. In particular, two major cases of this GTS model have been studied. One is named the U-P model, which employs a new functional form of the network growth rule: a linear combination of preferential attachment and uniform attachment. The degree distribution of the model is first studied by computer simulation, while the exact solution is also obtained analytically. Two other important properties of complex networks: the characteristic path length and the clustering coefficient are also extensively investigated, obtaining either analytically derived solutions or numerical results by computer simulations. Furthermore, I demonstrate that the hub-hub interaction behaves in effect as the link between a network's topology and resilience property. The other is called the Hybrid model, which incorporates two stages of growth and studies the transition behaviour between the Erdos and Renyi random graph model and the Barabasi-Albert scale-free model. The Hybrid model is measured by extensive numerical simulations focusing on its degree distribution, characteristic path length and clustering coefficient. Although either of the two cases serves as a new approach to modelling real-world large-scale complex networks, perhaps more importantly, the general two-stage model provides a new theoretical framework for complex network modelling, which can be extended in many ways besides the two studied in this thesis.
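The U-P rule described above (attach preferentially with some probability and uniformly otherwise) is easy to mimic in a few lines. The sketch below is a simplified, hypothetical growth process with one edge per new node, meant only to show how the mixing parameter p shapes the degree distribution; it is not the thesis's exact model or measurement code.

```python
import random
from collections import Counter

def grow_up_model(n_nodes=2000, p=0.7, seed=0):
    """Toy growth model: each new node links to one existing node chosen by
    preferential attachment with probability p and uniformly at random
    otherwise (an illustrative mix, not the exact U-P model of the thesis)."""
    rng = random.Random(seed)
    edges = [(0, 1)]                    # start from a single edge
    targets = [0, 1]                    # node list with multiplicity = degree
    for new in range(2, n_nodes):
        if rng.random() < p:
            old = rng.choice(targets)   # preferential: proportional to degree
        else:
            old = rng.randrange(new)    # uniform over existing nodes
        edges.append((new, old))
        targets.extend([new, old])
    return edges

edges = grow_up_model()
deg = Counter()
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
hist = Counter(deg.values())
for k in sorted(hist)[:10]:
    print(k, hist[k])                   # the tail gets heavier as p approaches 1
```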
APA, Harvard, Vancouver, ISO, and other styles
29

Rahmani, Hoda. "Traveling Salesman Problem with Single Truck and Multiple Drones for Delivery Purposes." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1563894245160348.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Ouali, Abdelkader. "Méthodes hybrides parallèles pour la résolution de problèmes d'optimisation combinatoire : application au clustering sous contraintes." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMC215/document.

Full text
Abstract:
Les problèmes d’optimisation combinatoire sont devenus la cible de nombreuses recherches scientifiques pour leur importance dans la résolution de problèmes académiques et de problèmes réels rencontrés dans le domaine de l’ingénierie et dans l’industrie. La résolution de ces problèmes par des méthodes exactes ne peut être envisagée à cause des délais de traitement souvent exorbitants que nécessiteraient ces méthodes pour atteindre la (les) solution(s) optimale(s). Dans cette thèse, nous nous sommes intéressés au contexte algorithmique de résolution des problèmes combinatoires, et au contexte de modélisation de ces problèmes. Au niveau algorithmique, nous avons appréhendé les méthodes hybrides qui excellent par leur capacité à faire coopérer les méthodes exactes et les méthodes approchées afin de produire rapidement des solutions. Au niveau modélisation, nous avons travaillé sur la spécification et la résolution exacte des problématiques complexes de fouille des ensembles de motifs en étudiant tout particulièrement le passage à l’échelle sur des bases de données de grande taille. D'une part, nous avons proposé une première parallélisation de l'algorithme DGVNS, appelée CPDGVNS, qui explore en parallèle les différents clusters fournis par la décomposition arborescente en partageant la meilleure solution trouvée sur un modèle maître-travailleur. Deux autres stratégies, appelées RADGVNS et RSDGVNS, ont été proposées qui améliorent la fréquence d'échange des solutions intermédiaires entre les différents processus. Les expérimentations effectuées sur des problèmes combinatoires difficiles montrent l'adéquation et l'efficacité de nos méthodes parallèles. D'autre part, nous avons proposé une approche hybride combinant à la fois les techniques de programmation linéaire en nombres entiers (PLNE) et la fouille de motifs. Notre approche est complète et tire profit du cadre général de la PLNE (en procurant un haut niveau de flexibilité et d’expressivité) et des heuristiques spécialisées pour l’exploration et l’extraction de données (pour améliorer les temps de calcul). Outre le cadre général de l’extraction des ensembles de motifs, nous avons étudié plus particulièrement deux problèmes : le clustering conceptuel et le problème de tuilage (tiling). Les expérimentations menées ont montré l’apport de notre proposition par rapport aux approches à base de contraintes et aux heuristiques spécialisées
Combinatorial optimization problems have become the target of much scientific research because of their importance in solving academic problems and real problems encountered in engineering and industry. Solving these problems with exact methods is often intractable because of the exorbitant processing time these methods would require to reach the optimal solution(s). In this thesis, we were interested in the algorithmic context of solving combinatorial problems, and in the modeling context of these problems. At the algorithmic level, we explored hybrid methods, which excel in their ability to make exact and approximate methods cooperate in order to rapidly produce high-quality solutions. At the modeling level, we worked on the specification and exact resolution of complex pattern set mining problems, in particular by studying scaling issues in large databases. On the one hand, we proposed a first parallelization of the DGVNS algorithm, called CPDGVNS, which explores in parallel the different clusters of the tree decomposition by sharing the best overall solution on a master-worker model. Two other strategies, called RADGVNS and RSDGVNS, were proposed to improve the frequency of exchanging intermediate solutions between the different processes. Experiments carried out on difficult combinatorial problems show the effectiveness of our parallel methods. On the other hand, we proposed a hybrid approach combining techniques of both Integer Linear Programming (ILP) and pattern mining. Our approach is comprehensive and takes advantage of the general ILP framework (providing a high level of flexibility and expressiveness) and of specialized heuristics for data mining (to improve computing time). In addition to the general framework for pattern set mining, two problems were studied: conceptual clustering and the tiling problem. The experiments carried out showed the contribution of our proposition compared with constraint-based approaches and specialized heuristics.
APA, Harvard, Vancouver, ISO, and other styles
31

Grozavu, Nistor. "Classification topologique pondérée : approches modulaires, hybrides et collaboratives." Paris 13, 2009. http://www.theses.fr/2009PA132022.

Full text
Abstract:
Cette thèse est consacrée d'une part, à l'étude d'approches de caractérisation des classes découvertes pendant l'apprentissage non-supervisé, et d'autre part, à la classification non-supervisée modulaire, hybride et collaborative. L'étude se focalise essentiellement sur deux axes : - la caractérisation des classes en utilisant la pondération et la sélection des variables pertinentes, ainsi que l'utilisation de la notion de mémoire pendant le processus d'apprentissage topologique non-supervisé; - l'utilisation de plusieurs techniques de clustering en parallèle et en série : approches modulaires, hybrides et collaboratives. Nous nous intéressons plus particulièrement dans cette thèse aux cartes auto-organisatrices de Kohonen qui constituent une technique bien adaptée à la classification non-supervisée permettant une visualisation des résultats sous forme d'une carte topographique. Nous proposons plusieurs techniques de pondérations de l'apprentissage de ces cartes ainsi qu'une nouvelle stratégie de compétition permettant de garder en mémoire l'historique de l'apprentissage. En utilisant un test statistique pour la sélection des variables pertinentes pondérées, nous répondons au problème de la réduction des dimensions, ainsi qu'au problème de la caractérisation des classes découvertes. Concernant le deuxième axe, nous utilisons le formalisme mathématique de l'analyse relationnelle (AR) pour combiner plusieurs résultats de classification. Enfin, nous proposons une nouvelle approche conçue pour faire collaborer plusieurs classifications topographiques entre elles ,en préservant la confidentialité des données
This thesis focuses, on the one hand, on the study of cluster characterization approaches in unsupervised topological learning and, on the other hand, on modular, hybrid and collaborative topological clustering. The study mainly addresses two problems: cluster characterization using weighting and selection of relevant variables, together with the use of a memory concept during the unsupervised topological learning process; and ensemble clustering techniques, namely modularization, hybridization and collaboration. We are particularly interested in Kohonen's self-organizing maps, which have been widely used for unsupervised classification and visualization of multidimensional datasets. We propose several weighting approaches and a new strategy that introduces a memory process into the competition phase by computing a voting matrix at each learning iteration. Using a statistical test for selecting relevant weighted variables, we address both the problem of dimensionality reduction and the problem of characterizing the discovered clusters. For the second problem, we use the relational analysis (RA) approach to combine multiple topological clustering results.
APA, Harvard, Vancouver, ISO, and other styles
32

Savoca, Marco [Verfasser], Otto [Akademischer Betreuer] Dopfer, Otto [Gutachter] Dopfer, and Gereon [Gutachter] Nidner-Schatteburg. "Spektroskopie an silizium- und kohlenstoffhaltigen Clustern: Dotierung und Hybride / Marco Savoca ; Gutachter: Otto Dopfer, Gereon Nidner-Schatteburg ; Betreuer: Otto Dopfer." Berlin : Technische Universität Berlin, 2018. http://d-nb.info/1156274737/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Jemili, Imen. "Clusterisation et conservation d’énergie dans les réseaux ad hoc hybrides à grande échelle." Thesis, Bordeaux 1, 2009. http://www.theses.fr/2009BOR13818/document.

Full text
Abstract:
Dans le cadre des réseaux ad hoc à grande envergure, le concept de clusterisation peut être mis à profit afin de faire face aux problèmes de passage à l'échelle et d'accroître les performances du système. Tout d’abord, cette thèse présente notre algorithme de clusterisation TBCA ‘Tiered based Clustering algorithm’, ayant pour objectif d’organiser le processus de clusterisation en couches et de réduire au maximum le trafic de contrôle associé à la phase d’établissement et de maintenance de l’infrastructure virtuelle générée. La formation et la maintenance d’une infrastructure virtuelle ne sont pas une fin en soi. Dans cet axe, on a exploité les apports de notre mécanisme de clusterisation conjointement avec le mode veille, à travers la proposition de l’approche de conservation d’énergie baptisée CPPCM ‘Cluster based Prioritized Power Conservation Mechanism’ avec deux variantes. Notre objectif principal est de réduire la consommation d’énergie tout en assurant l’acheminement des paquets de données sans endurer des temps d’attente importants aux niveaux des files d’attente des nœuds impliqués dans le transfert. Nous avons proposé aussi un algorithme de routage LCR ‘Layered Cluster based Routing’ se basant sur l’existence d’une infrastructure virtuelle. L’exploitation des apports de notre mécanisme TBCA et la limitation des tâches de routage additionnelles à un sous ensemble de nœuds sont des atouts pour assurer le passage à l’échelle de notre algorithme
Relying on a virtual infrastructure seems a promising approach to overcoming the scalability problem in large-scale ad hoc networks. First, we propose a clustering mechanism, TBCA 'Tiered based Clustering Algorithm', operating in a layered manner and exploiting eventual collisions to accelerate the clustering process. Our mechanism does not require any kind of neighbourhood knowledge, which relieves the network of some of the control messages exchanged during the clustering and maintenance process. Since energy consumption is still a critical issue, we combine a clustering technique with the power saving mode in order to conserve energy without affecting network performance. The main contribution of our power saving approach lies in differentiating among packets based on the amount of network resources they have consumed so far. Besides, the proposed structure of the beacon interval can be adjusted dynamically and locally by each node according to its own specific requirements. We also propose a routing algorithm, LCR 'Layered Cluster based Routing'. The basic idea consists of assigning additional tasks to a limited set of dominating nodes satisfying specific requirements, while exploiting the benefits of our clustering algorithm TBCA.
APA, Harvard, Vancouver, ISO, and other styles
34

Liu, YiChun, and 劉逸群. "A Hybrid Approach to Clustering Algorithms." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/39232384482741892012.

Full text
Abstract:
Master's
National Central University
Institute of Computer Science and Information Engineering
90
Clustering algorithms are effective tools for exploring the structure of complex data sets and are therefore of great value in a number of applications. For most clustering algorithms, two crucial problems must be solved: (1) determining the optimal number of clusters and (2) determining the similarity measure by which patterns are assigned to the corresponding clusters. Estimating the number of clusters in a data set is the so-called cluster validity problem. Conventional approaches to the cluster validity problem usually involve increasing the number of clusters and/or merging existing clusters, computing certain cluster validity measures in each run until a partition into the optimal number of clusters is obtained. Since most validity measures assume a certain geometrical structure of the cluster shapes, these approaches fail to estimate the correct number of clusters in real data with a large variety of distributions within and between clusters. The second crucial problem faces a similar situation. While it is easy to consider the idea of a data cluster on a rather informal basis, it is very difficult to give a formal and universal definition of a cluster. Most conventional clustering methods assume that patterns having similar locations or constant density form a single cluster. In order to mathematically identify clusters in a data set, it is usually necessary to first define a measure of similarity or proximity which establishes a rule for assigning patterns to the domain of a particular cluster center. As is to be expected, the measure of similarity is problem dependent; that is, different similarity measures result in different clustering results. In this thesis, we propose a hierarchical approach to an ART-like clustering algorithm that can deal with data consisting of clusters of arbitrary geometrical shape. Combining hierarchical and ART-like clustering is suggested as a natural and feasible solution to the two problems of determining the number of clusters and clustering the data.
APA, Harvard, Vancouver, ISO, and other styles
35

Cheng, Yi-Shan, and 鄭伊珊. "A Modified Hybrid Recommendation Mechanism using clustering concept." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/dm7u39.

Full text
Abstract:
Master's
Chung Yuan Christian University
Institute of Information Management
98
In the era of information explosion, a great deal of information surrounds our daily life, and helping people filter out unnecessary data improves their ability to obtain appropriate information. This study therefore adopts user profile information to construct a user preference model, and develops a classification method and a simulation tool to recommend items and content to users. First, the proposed method uses k-means clustering to group users according to their personal attributes. Second, neural networks are used to model user preferences. In addition, a fuzzy method considers the preferences of users to recommend items by searching through their neighborhood. Finally, the system combines k-means clustering, neural networks, and fuzzy methods to recommend items to users. To resolve the new-user problem of traditional recommendation methods, the proposed method uses the ratings of existing neighbors in the same cluster to construct the preference network of a new user and predict that user's ratings. Compared with the experimental results obtained from neural networks, decision trees, and association rules, the proposed method achieves better prediction accuracy and increases the quality of the recommendation results.
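A tiny sketch of the cold-start idea described above: users are grouped by k-means on their profile attributes, and a new user's rating is predicted from the known ratings of cluster members. The attribute vectors, the rating matrix and the cluster count are invented for illustration, and the neural network and fuzzy components of the thesis are omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: user attribute vectors (e.g. age, gender code, occupation code)
# and a small user-item rating matrix where 0 means "not rated".
attrs = np.array([[25, 0, 1], [23, 0, 1], [45, 1, 3], [47, 1, 3], [31, 0, 2]], float)
ratings = np.array([[5, 0, 1],
                    [4, 0, 2],
                    [1, 5, 0],
                    [2, 4, 0],
                    [0, 3, 4]], float)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(attrs)

def predict_for_new_user(new_attrs, item):
    """Cold-start prediction: assign the new user to an attribute cluster and
    average the cluster members' known ratings for the item."""
    cluster = km.predict(np.asarray(new_attrs, float).reshape(1, -1))[0]
    members = ratings[km.labels_ == cluster, item]
    known = members[members > 0]
    if known.size:
        return known.mean()
    col = ratings[:, item]
    return col[col > 0].mean()   # fall back to the item's global mean rating

print(predict_for_new_user([24, 0, 1], item=0))   # close to the young-users cluster's ratings
```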
APA, Harvard, Vancouver, ISO, and other styles
36

Hsu, Wu-hsien, and 許武先. "Conjecturable Rules Discovery by Clustering-Classification Hybrid Approach." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/65946782992000654374.

Full text
Abstract:
Doctoral
National Central University
Institute of Information Management
99
Discovering hidden or unknown knowledge is the major theme of most data mining studies. In this dissertation, we propose a new approach to discover conjecturable rules, which categorize observations of a data set into classes of similar attribute values instead of classes of crisp labels. The proposed approach is built on the two most developed data mining techniques: classification and clustering. Classification is the problem of identifying the sub-population to which new observations belong; the result is decided according to a set of rules discovered from a training set of observations whose sub-population is known. The technique is known as supervised learning, i.e. pre-defined labels are necessary for the process, and the result is a set of rules that can predict which label a new observation belongs to. However, when no labels exist in the dataset, this technique cannot be applied. Clustering, on the other hand, is the process of grouping a set of objects into classes of similar objects. No pre-defined labels are necessary, so it is known as unsupervised learning, yet no rules are preserved after the process for future prediction. The objective of this dissertation is to discover conjecturable rules from datasets that do not have any predefined class label. Furthermore, the technique extends our two previous studies with fuzzy concepts and outlier handling, so recessive conjecturable rules can be discovered and the accuracy improved. The proposed technique combines the convenience of unsupervised learning with the predictive ability of decision trees. The experimental results show that our proposed approach is capable of discovering conjecturable rules as well as recessive rules. A sensitivity analysis is also given for practitioners' reference.
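One simple way to picture "rules from unlabeled data" is to cluster first and then fit a shallow decision tree on the cluster labels, so that each leaf reads as a rule over attribute values. The sketch below does exactly that on the Iris data with scikit-learn; it is a generic cluster-then-classify illustration, not the dissertation's algorithm, and it ignores the fuzzy and outlier-handling extensions.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

# Pretend the class labels are unknown: cluster first (unsupervised), then fit a
# shallow decision tree on the cluster labels to obtain readable, rule-like output.
iris = load_iris()
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(iris.data)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, labels)
print(export_text(tree, feature_names=iris.feature_names))
```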
APA, Harvard, Vancouver, ISO, and other styles
37

CHEN, PEI-YIN, and 陳姵穎. "A Hybrid Autoencoder Networks for Unsupervised Image Clustering." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/76trm9.

Full text
Abstract:
Master's
Soochow University
Department of Information Management
107
Image clustering involves the process of mapping an archive image into a cluster such that the set of clusters has the same information. It is an important field of machine learning and computer vision. Although traditional clustering methods, such as k-means or the agglomerative clustering method, have been widely used for the task of clustering, it is difficult for them to handle image data due to having no predefined distance metrics and high dimensionality. Recently, deep unsupervised feature learning methods, such as the autoencoder (AE), have been employed for image clustering with great success. However, each model has its specialty and advantages for image clustering. Hence, we combine three AE-based models—the convolutional autoencoder (CAE), adversarial autoencoder (AAE), and stacked autoencoder (SAE)—to form a hybrid autoencoder (BAE) model for image clustering. The MNIST and CIFAR-10 datasets are used to test the result of the proposed models and compare the results with others. The results of the clustering criteria indicate that the proposed models outperform others in the numerical experiment.
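The overall pipeline (learn a compact representation with an autoencoder, then cluster in the latent space) can be sketched in a few lines. The snippet below uses a single small fully connected autoencoder and random tensors as a stand-in for flattened MNIST images, so it only shows the mechanics, not the thesis's hybrid CAE/AAE/SAE model.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class TinyAE(nn.Module):
    """One small fully connected autoencoder; the thesis combines several AE
    variants (CAE, AAE, SAE), but a single AE is enough to show the pipeline."""
    def __init__(self, d_in=784, d_lat=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(), nn.Linear(256, d_lat))
        self.dec = nn.Sequential(nn.Linear(d_lat, 256), nn.ReLU(), nn.Linear(256, d_in))
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

# Placeholder data standing in for flattened MNIST digits (load the real dataset in practice).
x = torch.rand(512, 784)

model = TinyAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(20):                       # a few reconstruction steps
    opt.zero_grad()
    recon, _ = model(x)
    loss = loss_fn(recon, x)
    loss.backward()
    opt.step()

with torch.no_grad():
    _, z = model(x)                       # latent codes
clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(z.numpy())
print(clusters[:20])
```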
APA, Harvard, Vancouver, ISO, and other styles
38

Chen, Mei-Chen, and 陳美蓁. "Sensing Data Clustering for Hybrid Cellular-Vehicular Networks." Thesis, 2019. http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107NCHU5394077%22.&searchmode=basic.

Full text
Abstract:
Master's
National Chung Hsing University
Department of Computer Science and Engineering
107
With the development of self-driving, research on vehicular networks has become important. We assume that every vehicle is controlled by the base station, so vehicle information needs to be uploaded to it. Vehicles are equipped with sensors to detect neighboring nodes; if every vehicle uploads its own information and that of its neighbors, too much redundant information is generated. Clustering techniques are used to reduce the overhead of the vehicular network. In past research, the selected Cluster Head (CH) used DSRC to achieve cluster formation, but this results in too many cluster management message packets. To solve this problem, our proposed clustering algorithm lets the cluster head assemble and upload the information of its neighboring nodes based on sensor data, which reduces redundant information and lowers network overhead. For cluster head election, a greedy procedure is used: we first check the line of sight of vehicles and the different road segments, and then compute the one-hop neighboring degree of each vehicle. The vehicle with the highest connection degree becomes the CH. We also add a comparison mechanism: if two or more nodes share the maximum connection degree, the number of two-hop neighboring vehicles is considered and the node with fewer two-hop neighbors is selected, which aims to reduce the generation of isolated nodes. In addition, we propose a method for vehicle convoys and a reliability mechanism that ensures every node can be covered by two cluster heads. The results show that our method can effectively reduce the amount of redundant information and the network overhead.
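A toy version of the cluster head election described above (highest one-hop degree, ties broken by fewer two-hop neighbors) can be written directly over adjacency sets. The topology below is invented, and the line-of-sight and convoy handling of the thesis are left out.

```python
# Hypothetical adjacency lists for vehicles that can sense each other on one road segment.
neighbors = {
    'A': {'B', 'C', 'D'},
    'B': {'A', 'C'},
    'C': {'A', 'B', 'D'},
    'D': {'A', 'C', 'E'},
    'E': {'D'},
}

def two_hop(v):
    """Two-hop neighborhood of v (excluding v and its direct neighbors)."""
    reach = set()
    for u in neighbors[v]:
        reach |= neighbors[u]
    return reach - neighbors[v] - {v}

def elect_cluster_head(nodes):
    """Pick the vehicle with the largest one-hop degree; on a tie, prefer the one
    with fewer two-hop neighbors (the tie-break described in the abstract)."""
    return max(nodes, key=lambda v: (len(neighbors[v]), -len(two_hop(v))))

ch = elect_cluster_head(neighbors)
print(ch, "covers", neighbors[ch])   # the CH uploads its own and its neighbors' data
```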
APA, Harvard, Vancouver, ISO, and other styles
39

Chen, Chien-Chih, and 陳建志. "Automatic Clustering for Cell Formation Using Hybrid Evolutionary Algorithms." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/c9r9zk.

Full text
Abstract:
Doctoral
Tatung University
Department of Computer Science and Engineering
102
To design an efficient cellular manufacturing system, the first fundamental task is to form machine cells and part families, called cell formation (CF). CF problems can be classified into two categories: standard CF and generalized CF (GCF). In standard CF each part has a single process routing, while in GCF each part has more than one process routing. One of the drawbacks of existing CF approaches is that the number of part families or machine cells has to be specified in advance. In practice, it is difficult for the machine cell designer to determine the optimal number of machine cells before the overall machine cell configuration is formed and the operational result is observed. In this dissertation, the author developed two hybrid evolutionary algorithms which can perform automatic clustering to solve standard CF and GCF problems, respectively. Experimental results indicate that effective hybrid optimization algorithms can achieve fast convergence and find global optimum more easily than an individual optimization algorithm. To solve standard CF problems, an automatic fuzzy clustering approach is proposed, in which a differential evolution (DE) algorithm is combined with the Fuzzy c-means (FCM) method. This CF algorithm can automatically determine the best number of machine cells and generate an optimal machine cell configuration at the same time. Experimental results demonstrate that the proposed algorithm performs well in searching solutions to the fuzzy machine CF problem with automatic cluster number determination. To solve GCF problems, the second automatic clustering approach can concurrently evolve the number and cluster centers of machine cells by using two particle swarm optimization (PSO) algorithms. In this approach, a solution representation, comprising an integer number and a set of real numbers, is adopted to encode the number of cells and machine cluster centers, respectively. Besides, a discrete PSO algorithm is utilized to search for the number of machine cells, and a continuous PSO algorithm is employed to perform machine clustering. Effectiveness of the proposed approach has been demonstrated for test problems selected from the literature and those generated in this study. The experimental results indicate that the proposed approach is capable of solving the generalized machine CF problem without predetermination of the number of cells.
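As background for the DE and PSO hybrids above, the following is a bare-bones fuzzy c-means loop, i.e. the clustering component that the evolutionary search wraps. The data are two synthetic Gaussian blobs, and the number of clusters is fixed by hand here rather than evolved as in the dissertation.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: alternate membership and centroid updates.
    The dissertation couples this kind of objective with DE/PSO to also
    search for the number of cells; here c is simply given."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # fuzzy memberships sum to 1 per point
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (dist ** (2 / (m - 1)))        # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

X = np.vstack([np.random.default_rng(1).normal(0, 0.3, (30, 2)),
               np.random.default_rng(2).normal(3, 0.3, (30, 2))])
centers, U = fuzzy_c_means(X, c=2)
print(np.round(centers, 2))                      # roughly [0, 0] and [3, 3]
```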
APA, Harvard, Vancouver, ISO, and other styles
40

Chen, Ko-ning, and 陳克寧. "A Study of Clustering Approaches Using Hybrid Neural Networks." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/34645542198203170776.

Full text
Abstract:
Master's
Nanhua University
Graduate Institute of Information Management
91
This thesis proposes two hybrid clustering approaches using neural networks. Both approaches use a Self-Organizing Map (SOM) to preprocess the data and then apply traditional clustering methods (e.g. Fuzzy Min-Max Neural Networks [28]) for data mining; benchmark datasets are used to test the approaches. The first approach uses the topology-preserving property of the SOM to generate protoclusters [21] (or quantization information) in the first step. The quantization information is then clustered again by our improved Contiguity-Constrained Clustering Method, which attains a minimum global variance when nearby protoclusters are merged. The second approach uses the SOM to preprocess the data and applies the result to a Fuzzy Min-Max Clustering Neural Network (FMM) [28] for clustering. Such an approach can therefore address the sensitivity problem, namely that different input orderings of the same data set may give the Fuzzy Min-Max Clustering Algorithm different hyperbox results.
APA, Harvard, Vancouver, ISO, and other styles
41

DLAMIN, THEMBELIHLE, and 狄天柏. "Clustering and Resource Allocation Schemes for Hybrid Femtocell Networks." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/27207044722567235666.

Full text
Abstract:
Master's
National Chiao Tung University
EECS International Graduate Program
102
Femtocells have been regarded as a solution for improving coverage and quality of service in residential and enterprise environments, since they offer low power consumption and can be deployed by the users themselves. A femtocell may be allowed to use either the same carrier frequency as the macro network or a different one. In an environment with dense femtocell deployment, resource allocation and interference management are important research issues, where the interference mainly arises from femtocells operating in different access modes. A femtocell operating in closed access mode only allows users with subcarrier usage rights to connect to it, whereas in open access mode all users may connect to the femtocell. To obtain the benefits of deploying femtocells in enterprise environments, a hybrid access mode can be adopted, which simultaneously serves users of the closed subscriber group (CSG) of a femtocell and non-closed subscriber group users. Moreover, a femtocell operating in hybrid access mode can provide different service levels for closed and non-closed users. In this thesis, we consider femtocells operating in hybrid access mode in which non-closed users are only allowed to use a restricted portion of the resources of the femtocell they associate with. To maximize the uplink capacity of the non-closed user group, a centralized power allocation scheme is proposed that allocates resources to non-closed users using geometric programming and a novel suboptimal clustering strategy; the admission control condition of non-closed users is also taken into account. In addition, a distributed power allocation algorithm within a game-theoretic framework is proposed, which exploits non-cooperative game theory and the convergence properties of its Nash equilibrium; the existence of a pure-strategy Nash equilibrium in the non-cooperative game is proved. The designed power allocation algorithm assigns uplink power mainly according to the distance between a femtocell and its served user, so as to maximize the utility function. Analytical results show that the proposed resource allocation and clustering algorithms can effectively improve the overall system performance.
APA, Harvard, Vancouver, ISO, and other styles
42

Chang, Hsi-mei, and 張喜媄. "Hybrid Algorithms of Finding Features for Clustering Sequential Data." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/17214461923175181494.

Full text
Abstract:
Master's
National Sun Yat-sen University
Department of Computer Science and Engineering
98
Proteins are the structural components of living cells and tissues, and thus an important building block in all living organisms. Patterns in protein sequences are subsequences that appear frequently. Patterns often denote important functional regions in proteins and can be used to characterize a protein family or discover the function of proteins; moreover, they provide valuable information about the evolution of species. Grouping protein sequences that share similar structure helps in identifying sequences with similar functionality. Many algorithms have been proposed for clustering proteins according to their similarity, i.e., sequential patterns in protein databases, for example, feature-based clustering algorithms of the global approach and the local approach. They use a sequential pattern mining algorithm to solve the no-gap-limit sequential pattern problem in a protein sequence database, and then find global features and local features separately for clustering. Feature-based clustering algorithms are entirely different approaches to protein clustering that do not require an all-against-all analysis and use a near-linear-complexity k-means based clustering algorithm. Although feature-based clustering algorithms are scalable and lead to reasonably good clusters, they spend time performing the global approach and the local approach separately. Therefore, in this thesis, we propose hybrid algorithms to find and mark features for feature-based clustering algorithms. We observe an interesting relation between the local features and the closed frequent sequential patterns: some features among the closed frequent sequential patterns can be split into several of the locally selected features, and the total support of these local features equals the support of the corresponding closed frequent sequential pattern. There are two phases, find-feature and mark-feature, in both the global approach and the local approach after mining sequential patterns. In our hybrid algorithm Method 1 (LocalG), we first find and mark the local features, then find the global features, and finally mark the bit vectors of the global features efficiently from the bit vectors of the local features. In our hybrid algorithm Method 2 (CLoseLG), we first find the closed frequent sequential patterns directly, next find local candidate features efficiently from the closed frequent sequential patterns and mark the local features, and finally find and mark the global features. Our performance study on biological and synthetic data shows that the proposed hybrid algorithms are more efficient than the feature-based algorithm.
APA, Harvard, Vancouver, ISO, and other styles
43

Chen, You-Yu, and 陳攸伃. "The Development of Hybrid Optimization Algorithm for Fuzzy Clustering." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/t937sf.

Full text
Abstract:
Master's
National Taipei University of Technology
Institute of Industrial Engineering and Management
97
The Fuzzy C-means algorithm proposed by Dunn (1974) is a commonly used fuzzy clustering method that clusters data starting from randomly selected initial centroids. With larger data sizes or higher attribute dimensions, clustering results may be affected and more repeated computations are required. To compensate for the effect of random initial centroids on the results, this study proposes a hybrid optimization algorithm, the Genetic Immune Fuzzy C-means Algorithm (GIFA), which first obtains proper initial cluster centroids and then clusters the data to improve clustering efficiency. GIFA is tested on three data sets (Teaching Assistant Evaluation, Ecoli, and Class Identification) and its results are compared with those of the Fuzzy C-means Algorithm (FCM), the Genetic Fuzzy C-means Algorithm (GFA), and the Immune Fuzzy C-means Algorithm (IFA). The advantages and disadvantages of the algorithms are analyzed in terms of the convergence value of the objective function and the number of iterations to convergence. The results suggest that GIFA achieves better clustering results.
APA, Harvard, Vancouver, ISO, and other styles
44

Chang, Chen-Yi, and 張陳益. "Adaptive Indoor Radiomap Localization using Hybrid Clustering-based Regression." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/huewev.

Full text
Abstract:
Master's
National Chiao Tung University
Institute of Communications Engineering
108
Location estimation has received extensive attention in view of the emerging demand for Location-Based Services (LBS) in indoor environments. WiFi fingerprinting is the most commonly used technique because of its simple implementation and its ability to also provide networking service. However, time-varying factors such as human effects and unstable received signal strength (RSS) from Access Points (APs) limit the performance of indoor LBS. Besides, collecting a suitable number of Reference Points (RPs) at suitable physical locations is time-consuming for a fingerprinting system. To address this problem, we propose the RSS-Oriented Map-Assisted RPs Clustering (ROMARC) algorithm to cluster RPs and provide appropriate numbers and locations of monitor points (MPs), which receive RSS continuously. Different from most clustering schemes for RSS fingerprinting systems, the ROMARC algorithm is designed to find RPs that have a similar variation of RSS values. Then, the cluster-based online database establishment (CODE) algorithm adopts a learning-based regression algorithm to construct a real-time database based on the results of ROMARC; the database obtained by CODE can achieve the required positioning accuracy. Furthermore, we propose the cluster-based feature scaling weighted KNN (CFS-WkNN) algorithm to estimate the target's location. For performance evaluation, simulation and implementation results show that the proposed system provides better location estimation than an expired database in a time-variant environment.
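The final positioning step above is a weighted k-nearest-neighbor match between an online RSS reading and the fingerprint database. A generic version (without the clustering and feature scaling that make it CFS-WkNN) looks like the sketch below, where the radio map values and coordinates are invented.

```python
import numpy as np

# Hypothetical offline radio map: RSS vectors (dBm, one value per AP) at known reference points.
radiomap_rss = np.array([[-40, -70, -65],
                         [-45, -60, -70],
                         [-70, -45, -50],
                         [-65, -50, -48]], float)
radiomap_xy = np.array([[0, 0], [0, 4], [6, 0], [6, 4]], float)

def wknn_locate(online_rss, k=3):
    """Weighted KNN: weight the k nearest fingerprints by inverse RSS distance
    and average their coordinates (a generic WKNN, not the thesis's CFS-WkNN)."""
    d = np.linalg.norm(radiomap_rss - online_rss, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + 1e-6)
    w /= w.sum()
    return w @ radiomap_xy[nearest]

print(np.round(wknn_locate(np.array([-42, -66, -66], float)), 2))
```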
APA, Harvard, Vancouver, ISO, and other styles
45

Ping-Hsun, Hsieh. "DESIGN OF HYBRID- CLUSTERING ALGORITHM FOR LOW POWER SCAN CHAINS." 2005. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2206200514134900.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Jhang, Min-cin, and 張敏勤. "Using a New Hybrid Genetic Algorithm to the Clustering Problem." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/07273402528108736065.

Full text
Abstract:
Master's
National Taiwan University of Science and Technology
Department of Electronic Engineering
94
Clustering in data mining is very useful for discovering distribution patterns in the underlying data. Clustering algorithms usually employ a distance-metric-based similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this thesis, we propose a new clustering algorithm that extends the Hybrid K-medoid Algorithm (H.K.A). H.K.A combines a genetic algorithm with a traditional clustering algorithm so that it can solve the clustering problem. The new clustering algorithm improves H.K.A by modifying the local search heuristic and the mutation operator. It runs faster than H.K.A in the evolutionary process and can also handle numerical data sets more efficiently in cluster analysis, with better clustering results. After experimenting with both algorithms on twelve data sets, the results show that our proposed algorithm finds better clustering results with far fewer generations and less time cost. This reveals the advantage of our proposed algorithm in solving the clustering problem.
APA, Harvard, Vancouver, ISO, and other styles
47

Cheng, Hung-Lien, and 程閎廉. "A Hybrid Collaborative Filtering Recommender System Based on Clustering Algorithm." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/52770657142827560926.

Full text
Abstract:
Master's
National Chung Hsing University
Department of Computer Science and Engineering
98
Collaborative filtering is one of the most popular recommendation techniques. Traditional collaborative filtering approaches mainly employ a matrix of users' ratings on items to calculate the similarity between users. If the features of users or items are provided in the data set in addition to the rating data, those features can be used to improve the quality of recommendations. In this thesis, we propose a hybrid recommender system based on clustering and collaborative filtering techniques. In the proposed system, items are clustered based on item features and the user-item rating matrix. Similarly, users are clustered based on the users' preferred categories of items and the user-item rating matrix. A hybrid method that combines content-based and collaborative filtering is then proposed to predict the rating of an item for a given user. The experimental results show that the proposed method has higher accuracy, in terms of mean absolute error, than the user-based collaborative filtering approach, the item-based filtering approach, Clustering Items for Collaborative Filtering (CICF), and the User Profile Clustering (UPC) method. In particular, when the dataset is sparse, the accuracy of the proposed method is better and more stable than that of the other methods.
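The prediction step of such a hybrid can be pictured as blending two item similarities, one computed from the ratings and one from the item features, before taking a weighted average of the user's known ratings. The toy matrix, feature vectors and blending weight below are made up, and the clustering stage of the thesis is omitted.

```python
import numpy as np

# Toy user-item rating matrix (0 = unrated) and one content feature vector per item.
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], float)
item_feat = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)   # e.g. genre indicators

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def predict(user, item, alpha=0.5):
    """Blend rating-based and feature-based item similarities, then take a
    similarity-weighted average of the user's known ratings (a generic hybrid,
    not the exact clustering-based scheme of the thesis)."""
    scores, weights = 0.0, 0.0
    for j in range(R.shape[1]):
        if j == item or R[user, j] == 0:
            continue
        sim = alpha * cosine(R[:, item], R[:, j]) + (1 - alpha) * cosine(item_feat[item], item_feat[j])
        scores += sim * R[user, j]
        weights += abs(sim)
    return scores / weights if weights else 0.0

print(round(predict(user=0, item=2), 2))   # low: user 0 dislikes items of this type
```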
APA, Harvard, Vancouver, ISO, and other styles
48

Hsieh, Ping-Hsun, and 謝秉勳. "DESIGN OF HYBRID- CLUSTERING ALGORITHM FOR LOW POWER SCAN CHAINS." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/66667558325648869139.

Full text
Abstract:
Master's
National Taiwan University
Graduate Institute of Electrical Engineering
93
The use of scan-based architectures is widespread in circuit testing nowadays, yet it is expensive in power consumption. Scan chain reordering techniques have been used for years to reduce power dissipation in traditional DfT (Design for Test); nevertheless, one of the main concerns, namely the length of the scan routing, has received plenty of attention because a trade-off exists between power reduction and the reduction of wire length. Hence, in this thesis, a hybrid clustering algorithm named ISAC (Intrinsic Structure Approximation-based Clustering), consisting of OPTICS and k-means, is proposed. ISAC uses information obtained by OPTICS about the intrinsic structure of the scan cell distribution to determine the number of clusters generated by k-means, in which k compact circle-like clusters are formed. A geometric property is proved: given a diameter, a circle-like cluster covers the maximum area and can therefore contain as many cells as possible. Results from our quantitative simulations on the benchmark circuit s9234 demonstrate the efficiency of ISAC in both power reduction and wire length saving, with reductions of up to 16.563% and 65.989%, respectively.
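The pairing of a density-based pass with k-means can be sketched with scikit-learn: OPTICS suggests how many groups the cell placement naturally contains, and k-means then forms that many compact clusters. The coordinates below are synthetic, and the scan routing and power evaluation of ISAC are not modeled.

```python
import numpy as np
from sklearn.cluster import OPTICS, KMeans

# Hypothetical 2-D placements of scan cells (three loose groups plus a few noise points).
rng = np.random.default_rng(0)
cells = np.vstack([rng.normal(c, 0.5, (40, 2)) for c in ([0, 0], [5, 0], [2, 6])]
                  + [rng.uniform(-2, 8, (10, 2))])

# Step 1: let a density-based pass (OPTICS) reveal the intrinsic number of groups.
optics_labels = OPTICS(min_samples=5).fit_predict(cells)
k = len(set(optics_labels)) - (1 if -1 in optics_labels else 0)   # ignore the noise label -1

# Step 2: run k-means with that k to get compact, roughly circular clusters,
# mirroring the OPTICS + k-means pairing described for ISAC.
kmeans_labels = KMeans(n_clusters=max(k, 1), n_init=10, random_state=0).fit_predict(cells)
print("estimated k =", k)
```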
APA, Harvard, Vancouver, ISO, and other styles
49

Wang, Yu-shao, and 王欲韶. "Hybrid Swarm Intelligence Algorithms with Biodiversity Applied to Data Clustering." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/67637503971267115429.

Full text
Abstract:
Master's
National Kaohsiung First University of Science and Technology
Graduate Institute of Information Management
100
This study proposes innovative hybrid models with diversity both in the subpopulations and in the structure of hybridization, to prevent early convergence in the computing process. Several models with different levels of biodiversity were implemented and compared by combining swarm intelligence algorithms with genetic algorithms. The proposed hybrid models were applied to data clustering using public UCI datasets. Experimental results show that the proposed hybrid systems with high biodiversity improve the performance of data clustering.
APA, Harvard, Vancouver, ISO, and other styles
50

Babu, T. Ravindra. "Large Data Clustering And Classification Schemes For Data Mining." Thesis, 2006. https://etd.iisc.ac.in/handle/2005/440.

Full text
Abstract:
Data Mining deals with extracting valid, novel, easily understood, potentially useful and general abstractions from large data. Data is large when the number of patterns, the number of features per pattern, or both are large; largeness is characterized by a size beyond the capacity of a computer's main memory. Data Mining is an interdisciplinary field involving database systems, statistics, machine learning, visualization and computational aspects, and the focus of data mining algorithms is scalability and efficiency. Large data clustering and classification is an important activity in Data Mining. Clustering algorithms are predominantly iterative and require multiple scans of the dataset, which is very expensive when the data is stored on disk. In the current work we propose different schemes that have both theoretical validity and practical utility in dealing with such large data. The schemes broadly encompass data compaction, classification, prototype selection, use of domain knowledge and hybrid intelligent systems. The proposed approaches can be broadly classified as (a) compressing the data in a non-lossy manner and clustering as well as classifying the patterns in their compressed form directly through a novel algorithm, (b) compressing the data in a lossy fashion such that a very high degree of compression and abstraction is obtained in terms of 'distinct subsequences', and classifying the data in this compressed form to improve prediction accuracy, (c) obtaining simultaneous prototype and feature selection with the help of incremental clustering, a lossy compression scheme and a rough set approach, (d) demonstrating that prototype selection and data-dependent techniques can reduce the number of comparisons in a multiclass classification scenario using SVMs, and (e) showing that, by making use of domain knowledge of the problem and of the data under consideration, we obtain very high classification accuracy with fewer iterations of AdaBoost. The schemes have pragmatic utility. The prototype selection algorithm is incremental, requires a single dataset scan and has linear time and space requirements. We provide results obtained with a large, high-dimensional handwritten (hw) digit dataset. The compression algorithm is based on simple concepts; we demonstrate that classification of the compressed data reduces the computation time required by a factor of 5, with prediction accuracy on compressed and original data being exactly the same at 92.47%. With the proposed lossy compression scheme and pruning methods, we demonstrate that even with a reduction of distinct subsequences by a factor of 6 (690 to 106), the prediction accuracy improves. Specifically, with the original data containing 690 distinct subsequences the classification accuracy is 92.47%, and with an appropriate choice of pruning parameters the number of distinct subsequences reduces to 106 with a corresponding classification accuracy of 92.92%. The best classification accuracy of 93.3% is obtained with 452 distinct subsequences. With the scheme of simultaneous feature and prototype selection, we improve classification accuracy beyond that obtained with kNNC, viz. 93.58%, while significantly reducing the number of features and prototypes, achieving a compaction of 45.1%.
In the case of hybrid schemes based on SVMs, prototypes and a domain-knowledge-based tree (KB-Tree), we demonstrate a reduction of SVM training time by 50% and of testing time by about 30% compared to the complete data, and an improvement of classification accuracy to 94.75%. In the case of AdaBoost the classification accuracy is 94.48%, which is better than that obtained with NNC and kNNC on the entire data; the training time is reduced because prototypes are used instead of the complete data. Another important aspect of the work is the design of a KB-Tree (with a maximum depth of 4) that classifies 10-category data in just 4 comparisons. In addition to the hw data, we applied the schemes to Network Intrusion Detection data (the 10% dataset of KDDCUP99) and demonstrated that the proposed schemes provide a lower overall cost than the reported values.
APA, Harvard, Vancouver, ISO, and other styles