Dissertations on the topic "Hybrid data mining"
Browse the top 50 dissertations on the topic "Hybrid data mining".
Daglar, Toprak Seda. "A New Hybrid Multi-relational Data Mining Technique". Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606150/index.pdf.
Seetan, Raed. "A Data Mining Approach to Radiation Hybrid Mapping". Diss., North Dakota State University, 2014. https://hdl.handle.net/10365/27315.
Zall, Davood. "Visual Data Mining : An Approach to Hybrid 3D Visualization". Thesis, Högskolan i Borås, Institutionen Handels- och IT-högskolan, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-16601.
Program: Magisterutbildning i informatik
Yang, Pengyi. "Ensemble methods and hybrid algorithms for computational and systems biology". Thesis, The University of Sydney, 2012. https://hdl.handle.net/2123/28979.
Theobald, Claire. "Bayesian Deep Learning for Mining and Analyzing Astronomical Data". Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0081.
In this thesis, we address the issue of trust in deep learning predictive systems along two complementary research directions. The first focuses on the ability of AI to estimate the uncertainty of its decisions as accurately as possible. The second focuses on the explainability of these systems, that is, their ability to convince human users of the soundness of their predictions.
The problem of estimating uncertainties is addressed from the perspective of Bayesian Deep Learning. Bayesian neural networks assume a probability distribution over their parameters, which allows them to estimate different types of uncertainty: aleatoric uncertainty, which is related to the data, and epistemic uncertainty, which quantifies the model's lack of knowledge about the data distribution. More specifically, this thesis proposes a Bayesian neural network that can estimate these uncertainties in the context of a multivariate regression task. This model is applied to the regression of complex ellipticities on galaxy images as part of the ANR project "AstroDeep". These images can be corrupted by different sources of perturbation and noise, which can be reliably estimated by the different uncertainties. The exploitation of these uncertainties is then extended to galaxy mapping and then to "coaching" the Bayesian neural network. This last technique consists of generating increasingly complex data during the model's training process to improve its performance.
The problem of explainability, on the other hand, is approached from the perspective of counterfactual explanations. These explanations consist of identifying what changes to the input parameters would have led to a different prediction. Our contribution in this field is based on the generation of counterfactual explanations relying on a variational autoencoder (VAE) and an ensemble of predictors trained on the latent space generated by the VAE. This method is particularly well suited to high-dimensional data, such as images. In this case, they are referred to as counterfactual visual explanations. By exploiting both the latent space and the ensemble of classifiers, we can efficiently produce visual counterfactual explanations that reach a higher degree of realism than several state-of-the-art methods.
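The distinction between aleatoric and epistemic uncertainty in the abstract above can be illustrated with a small sketch. It assumes a common decomposition based on the law of total variance over sampled networks, which is not necessarily the formulation used in the thesis: each sampled network predicts a mean and a variance for one scalar output, the average predicted variance is read as aleatoric uncertainty, and the spread of the predicted means is read as epistemic uncertainty. The function name and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split the predictive variance of one scalar output.

    means[i] and variances[i] are the mean and variance predicted by the
    i-th sampled network (assumed setup, e.g. Monte Carlo weight samples).
    Returns (aleatoric, epistemic, total).
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    aleatoric = fmean(variances)   # average noise the networks attribute to the data
    epistemic = pvariance(means)   # disagreement between the sampled networks
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions from four sampled networks for a single galaxy image.
sample_means = [0.10, 0.14, 0.12, 0.16]
sample_vars = [0.02, 0.03, 0.02, 0.01]
aleatoric, epistemic, total = decompose_uncertainty(sample_means, sample_vars)
print(aleatoric, epistemic, total)
```

In this toy case the networks largely agree, so most of the predictive variance is attributed to the data rather than to the model.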
Cheng, Xueqi. "Exploring Hybrid Dynamic and Static Techniques for Software Verification". Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/26216.
Ph. D.
Viademonte, da Rosa Sérgio I. (Sérgio Ivan) 1964. "A hybrid model for intelligent decision support : combining data mining and artificial neural networks". Monash University, School of Information Management and Systems, 2004. http://arrow.monash.edu.au/hdl/1959.1/5159.
Pande, Anurag. "Estimation of Hybrid Models for Real-Time Crash Risk Assessment on Freeways". Doctoral diss., University of Central Florida, 2005. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3016.
Ph.D.
Department of Civil and Environmental Engineering
Engineering and Computer Science
Civil Engineering
Sainani, Varsha. "Hybrid Layered Intrusion Detection System". Scholarly Repository, 2009. http://scholarlyrepository.miami.edu/oa_theses/44.
Zhang, Jiapu. "Derivative-free hybrid methods in global optimization and their applications". Thesis, University of Ballarat, 2005. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34054.
Doctor of Philosophy
Hussain, Mukhtar. "Data-driven discovery of mode switching conditions to create hybrid models of cyber-physical systems". Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/235043/1/Mukhtar_Hussain_Thesis.pdf.
Barak, Sasan. "Technical and Fundamental Features’ analysis for Stock Market Prediction with Data Mining Methods". Doctoral thesis, Università degli studi di Bergamo, 2019. http://hdl.handle.net/10446/128764.
Paramasivam, Vijayajothi. "Conceptual framework of a novel hybrid methodology between computational fluid dynamics and data mining techniques for medical dataset application". Thesis, Curtin University, 2017. http://hdl.handle.net/20.500.11937/54143.
Cheng, Iunniang. "Hybrid Methods for Feature Selection". TopSCHOLAR®, 2013. http://digitalcommons.wku.edu/theses/1244.
Lin, Pengpeng. "A Framework for Consistency Based Feature Selection". TopSCHOLAR®, 2009. http://digitalcommons.wku.edu/theses/62.
Alsalama, Ahmed. "A Hybrid Recommendation System Based on Association Rules". TopSCHOLAR®, 2013. http://digitalcommons.wku.edu/theses/1250.
Pagnossim, José Luiz Maturana. "Uma abordagem híbrida para sistemas de recomendação de notícias". Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/100/100131/tde-07062018-101232/.
Recommendation Systems (RS) are software capable of suggesting items to users based on the history of user interactions or on similarity metrics that can be compared by item, by user, or both. There are different types of RS; those of most interest in this work are content-based, knowledge-based, and collaborative filtering. Meeting users' expectations is a hard goal because of the inherent subjectivity of human behavior. RS therefore need efficient and effective solutions for: modeling the data that will support the recommendation; retrieving the information that describes the data; combining this information within similarity, popularity, or suitability metrics; creating descriptive models of the items under recommendation; and evolving the system's intelligence to learn from the user's interaction. Decision-making by an RS is a complex task that can be implemented from the viewpoint of fields such as artificial intelligence and data mining. In the artificial intelligence field, there are studies on case-based reasoning, which works on the principle that if something worked in the past, it may work again in a new situation similar to the one in the past. Case-based recommendation works with structured items, represented by a set of attributes and their respective values (within a "case" model), providing known and adapted solutions. Data mining can build descriptive models for RS and can also handle, manipulate, and analyze textual data, offering one option for creating elements to compose a recommendation. One way to minimize the weaknesses of an approach is to adopt a hybrid solution, which in this work considers: taking advantage of the different types of RS; using problem-solving techniques; and combining resources from different sources to compose a unified metric used to rank recommendations by relevance.
Among RS application areas, news recommendation stands out, as it is used by a large, heterogeneous public that demands relevance. In this context, this work presents a hybrid approach to news recommendation built on an architecture implemented to prove the concepts of a recommendation system. This architecture was validated using a news corpus and an online experiment. The experiment made it possible to observe the architecture's capacity to meet the requirements of a news recommendation system and to favor recommendations based on similarity, popularity, diversity, novelty, and serendipity. An improvement was also observed in the indicators of reading, likes, acceptance, and serendipity as the system accumulated a history of preferences and solutions. Analysis of the unified ranking metric confirmed its efficacy: the best-ranked news items were the ones most accepted by users.
Jiang, Xinxin. "Mining heterogeneous enterprise data". Thesis, 2018. http://hdl.handle.net/10453/129377.
Heterogeneity is becoming one of the key characteristics of enterprise data, because globalization and competition stress the importance of leveraging huge amounts of accumulated enterprise data across various organizational processes, resources, and standards. Effectively deriving meaningful insights from complex, large-scale heterogeneous enterprise data poses an interesting but critical challenge. The aim of this thesis is to investigate the theoretical foundations of mining heterogeneous enterprise data in light of these challenges and to develop new algorithms and frameworks that can effectively and efficiently consider heterogeneity in four elements of the data: objects, events, context, and domains. Objects describe a variety of business roles and instruments involved in business systems. Object heterogeneity means that object information at both the data and the structural level is heterogeneous. The proposed cost-sensitive hybrid neural network (Cs-HNN) leverages parallel network architectures and an algorithm specifically designed for minority classification to generate a robust model for learning heterogeneous objects. Events trace an object's behaviours or activities. Event heterogeneity reflects the level of variety in business events and is normally expressed in the type and format of features. The approach proposed in this thesis focuses on fleet tracking as a practical example of an application with a high degree of event heterogeneity. Context describes the environment and circumstances surrounding objects and events. Context heterogeneity reflects the degree of diversity in contextual features. The coupled collaborative filtering (CCF) approach proposed in this thesis provides context-aware recommendations by measuring the non-independent and identically distributed (non-IID) relationships across diverse contexts.
Domains are the sources of information and reflect the nature of the business or function that has generated the data. The cross-domain deep learning (Cd-DLA) approach proposed in this thesis provides a potential avenue to overcome the complexity and nonlinearity of heterogeneous domains. Each of the approaches, algorithms, and frameworks for heterogeneous enterprise data mining presented in this thesis outperforms state-of-the-art methods in a range of backgrounds and scenarios, as evidenced by a theoretical analysis, an empirical study, or both. All outcomes derived from this research have been published or accepted for publication, and the follow-up work has also been recognised, which demonstrates scholarly interest in mining heterogeneous enterprise data as a research topic. Despite this interest, heterogeneous data mining still holds increasingly attractive opportunities for further exploration and development in both academia and industry.
Babu, T. Ravindra. "Large Data Clustering And Classification Schemes For Data Mining". Thesis, 2006. https://etd.iisc.ac.in/handle/2005/440.
Babu, T. Ravindra. "Large Data Clustering And Classification Schemes For Data Mining". Thesis, 2006. http://hdl.handle.net/2005/440.
蔡明憲. "A Hybrid Data Mining Model for Customer Retention". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/25689304585306477235.
國立臺灣科技大學
電子工程系
90
Competition in the wireless telecommunications industry is fierce. To maintain profitability, wireless carriers must control churn, which is the loss of subscribers who switch from one carrier to another. This thesis proposes a hybrid architecture that tackles the complete customer retention problem, in the sense that it not only predicts churn probability but also proposes retention policies. The architecture works in two modes: learning and usage. In the learning mode, the churn model learner learns potential associations in the historical subscriber database to form a churn model. The policy model constructor then uses the attributes that appear in the churn model to segment all churners into distinct groups, and it develops a specific policy model for each churner group. In the usage mode, the churner predictor uses the churn model to predict the churn probability of a given subscriber. A high churn probability causes the churner predictor to invoke the policy maker, which suggests specific retention policies according to the policy model. Our experiments show that the learned churn model is about 85% accurate in evaluation. Currently, we have no proper data to evaluate the constructed policy model. The construction process, however, represents an interesting and important approach toward better support for retaining possible churners. This work is significant because state-of-the-art techniques focus only on increasing the accuracy of churn prediction. They either never address retention policies or propose policies only according to the path conditions of the decision tree that serves as the churn model. Our policy model construction process goes one step further by investigating the concept of churner groups, which amounts to uncovering the associations between the paths of the decision tree.
We believe that with this in-depth knowledge of how churns are related, we can propose better retention policy models for possible churners.
Tzu-Fan, Tang, and 湯子範. "A hybrid data mining approach for customer relationship management". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/20933792886165601712.
國立中正大學
會計與資訊科技研究所
96
Domestic and foreign enterprises currently face unprecedented competition. The "product-oriented" model has given way to a "customer-oriented" one, which makes customer relationship management (CRM) important. Customer retention is one major problem in CRM. Data mining techniques have been applied to predict the loss of customers (customer churn), and the literature has demonstrated their applicability to churn prediction. In this thesis, hybrid data mining methods are developed to improve on current single prediction models. In particular, two (different) techniques are combined in sequence, which leads to two stages of training or learning. This research considers Self-Organizing Maps (SOM) and Artificial Neural Networks (ANN), respectively, for the first component of the hybrid models. The second component, the prediction model that produces the final output, is based on ANN. The baseline against which the two hybrid models are compared is a single ANN without the first component. The experimental results show that the hybrid models outperform the baseline model in terms of prediction accuracy. In particular, ANN combined with ANN performs best, providing 93% prediction accuracy. In addition, it has the lowest Type I and Type II error rates.
Yang, Ren-fu, and 楊仁富. "Hybrid Data Mining and MSVM for Short Term Load Forecasting". Thesis, 2010. http://ndltd.ncl.edu.tw/handle/28661440234858506249.
國立中山大學
電機工程學系研究所
98
The accuracy of load forecasts has a significant impact on power companies' ability to execute power development plans, reduce operating costs, and provide reliable power to clients. Short-term load forecasting predicts load demand over a horizon of one hour or less. This study presents a new approach to load forecasting. A Support Vector Machine (SVM) is used for the initial load estimation, and Particle Swarm Optimization (PSO) is then adopted to search for optimal parameters for the SVM. In load forecasting, the training data is the most important factor affecting calculation time. Using more data for model training should give better forecast results, but it requires more computing time and is less efficient. Data mining can provide a means to reduce the data requirement and the computing time. The proposed Modified Support Vector Machine approach is shown to provide more accurate load forecasts.
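The parameter search described in this abstract can be sketched with a basic particle swarm optimizer. This is a generic textbook PSO, not the thesis's implementation: the objective below is a hypothetical stand-in for an SVM validation error over two log-scaled parameters, and the swarm size, inertia, and acceleration coefficients are assumed values chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over a box given as [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                # personal best positions
    best_val = [objective(p) for p in pos]        # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at log C = 1, log gamma = -2."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


search_box = [(-3.0, 3.0), (-5.0, 1.0)]
best_params, best_error = pso_minimize(toy_validation_error, search_box)
print(best_params, best_error)
```

In a real forecasting pipeline, the objective would train an SVM with the candidate parameters and return its error on held-out load data, which makes each evaluation far more expensive than this toy function.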
Chen, Lei Chun, and 陳蕾淳. "A Hybrid Data Mining Model in Analyzing Corporate Social Responsibility". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/07858790424620614969.
國立暨南國際大學
資訊管理學系
101
Over the past two decades, Corporate Social Responsibility (CSR) has received worldwide attention, and publishing CSR reports has become a trend among domestic and foreign enterprises. In a constantly changing competitive environment, public attention will focus on how enterprises play the role of corporate citizens and balance profit, environmental, and charitable activities. However, most previous quantitative studies of CSR rely on traditional statistical approaches, and data mining techniques have not been widely explored in this area. This investigation therefore proposes a hybrid data mining model, CSFSC, that integrates data preprocessing approaches, a classification method, and a rule generation mechanism. The data preprocessing approaches include Correlation-based Feature Selection (CFS), the Synthetic Minority Over-sampling Technique (SMOTE), and the Fuzzy C-Means (FCM) clustering algorithm. The One-Against-One Support Vector Machine (OAOSVM) method is employed as the classifier for the multi-class classification task. The rule-based learning algorithm C5.0 is used to generate rules from the results of the OAOSVM model. CSR data collected from China's listed firms in 2010 are used to examine the performance of the proposed model. The empirical results show that the CSFSC model yields satisfactory classification accuracy and provides rules for decision makers. The CSFSC model is therefore a feasible and effective alternative for analyzing CSR.
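Of the preprocessing steps named in this abstract, SMOTE is the easiest to show in a few lines. The sketch below follows the basic idea of the technique: pick a minority sample, pick one of its nearest minority neighbours, and create a synthetic point on the segment between them. It is a simplified illustration with hypothetical data and parameter values, not the configuration used in the thesis.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical two-feature minority class with four samples.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority_points, n_synthetic=6)
print(new_points)
```

Because every synthetic point lies between two existing minority samples, the new data stays inside the region the minority class already occupies.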
Lee, Chia-Hsun, and 李嘉訓. "A Hybrid Data Mining Approach to Quality Control of Machining Process". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/31833725094107767456.
國立暨南國際大學
資訊管理學系
94
Quality is now one of the best sources of competitive advantage, and high quality performance is becoming critically important. Quality control is a process employed to ensure a certain level of quality in a product or service. One quality control technique is to predict product quality based on product features. However, traditional quality control techniques have weaknesses, such as specific control limits, heavy reliance on the collection and analysis of data, and the processing of uncertainty. To improve the effectiveness of quality control, this thesis proposes an agent-based hybrid approach that incorporates rough set theory (RST), fuzzy logic, and a genetic algorithm. In this agent-based system, each agent can perform one or more functions in three stages. The feature and rule extraction stage is an RST procedure used to extract significant features and decision rules. The quality prediction stage develops a fuzzy logic system (FLS) to predict machining part quality. The optimization stage searches for the optimal solution of the FLS.
Lu, Chi-Jie, and 呂奇傑. "Hybrid Neural Network Classification Techniques in the Application of Data Mining". Thesis, 2001. http://ndltd.ncl.edu.tw/handle/37742016816981561250.
輔仁大學
應用統計學研究所
89
Data mining is the art of finding patterns in data. It is a new approach based on the general recognition that there is untapped value in large databases, and it uses data-driven extraction of information. However, it is still not easy to identify complicated relationships in a huge data set. Moreover, in most cases, the estimated parameters or the classification results cannot really describe the realization of business modeling. The artificial neural network is becoming a very popular alternative for prediction and classification tasks because of its associative memory characteristic and generalization capability. However, neural networks have been criticized for their long training process in classification problems. To address this drawback, this study explores the performance of data classification by integrating the artificial neural network technique with linear discriminant analysis and with fuzzy discriminant analysis, respectively. To demonstrate that including the classification results from linear discriminant and fuzzy discriminant analysis improves the classification accuracy of the designed neural networks, classification tasks are performed on two data sets: the widely used Iris data and a practical bank credit card data set. As the results reveal, the two proposed integrated approaches provide a better initial solution and hence converge much faster than conventional neural networks. In addition, compared with the traditional neural network approach, classification accuracy increases in both cases under the two proposed methodologies. Moreover, the superiority of the proposed technique can be observed by comparing its classification results with those obtained using only linear discriminant or fuzzy discriminant analysis.
Chen, Hsiao-ming, and 陳小明. "Prevention of Drug Dispensing Errors by Using Hybrid Data Mining Approaches". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/61860258367373868836.
國立成功大學
資訊工程學系碩博士班
96
One important issue in medical care is the prevention of drug dispensing errors, which have caused numerous injuries and deaths at great cost. In this thesis, we propose a hybrid data mining approach, with an implemented system, to solve this problem. Our approach consists of two main modules, HDMmodel and HDMclustering. In HDMmodel, J48 and logistic regression are used to derive a decision tree and a regression function from the given dispensing error cases and drug database. In HDMclustering, similar drugs that are easily confused with each other are gathered into clusters using a clustering technique named PoCluster together with the extracted logistic regression function. Our implemented system then raises alerts for risky drug pairs that may cause dispensing errors, along with interpretable prevention rules. Finally, an experimental evaluation on real datasets from a medical center shows that our approach can diagnose potential dispensing errors effectively.
Fan, Ching-Yi, and 范景怡. "Applying Data Mining Techniques to Combine Predictions in Hybrid Recommender Systems". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/92064178136185259961.
中國文化大學
資訊管理學系碩士在職專班
101
Recommender systems have been developed in several different ways. The main techniques used to build them are content-based (CB) filtering, collaborative filtering (CF), and demographic filtering (DF). Each technique has its advantages and limitations. For this reason, many scholars have proposed combining several techniques to reduce the disadvantages of any single method and achieve more precise recommendations. Currently, such combinations mostly rely on the experience of past research or on heuristic methods and lack a rigorous theoretical foundation. This study therefore uses the concepts of CB and CF, plus DF techniques, and combines their predictions using data mining techniques (i.e., linear regression and neural networks). The aim is to provide more accurate predictions than any single technique and to overcome the limitations of each.
Shu, I.-Ping, and 徐一平. "Study of Hybrid Data Mining Techniques Applied for Filtering Spam Mail". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/00992363360301001047.
華梵大學
資訊管理學系碩士班
97
Networks have been established and developed since 1970, and people now use them widely. Mail used to be delivered by hand, but much of it has shifted to e-mail. E-mail has reduced the time and distance of communication and has gradually changed the way we live and work. At the same time, some people seeking profit use malicious programs or collect e-mail addresses in various ways and then send e-mail indiscriminately, which troubles recipients. This study (GA/DT) adopts two data mining techniques, the genetic algorithm (GA) and the decision tree (DT), to select the Minimum Case and Pruning CF parameters. Experimental results indicate that the accuracy of the hybrid GA/DT algorithm is 95.0%. For the other algorithms, the reported figures are a better accuracy of 5.0% for the logistic algorithm, 2.337% for ANN, and 4.0% for SVM. This shows that the GA/DT algorithm can accurately select the Minimum Case and Pruning CF parameters of the DT algorithm and effectively enhance the performance of spam identification.
呂奇傑. "Hybrid Neural Network Classification Techniques in the Application of Data Mining". Thesis, 2001. http://ndltd.ncl.edu.tw/handle/76303042740984553449.
輔仁大學
應用統計研究所
89
Data mining is the art of finding patterns in data. It is a new approach based on the general recognition that there is untapped value in large databases, and it uses data-driven extraction of information. However, it is still not easy to identify complicated relationships in a huge data set. Moreover, in most cases, the estimated parameters or the classification results cannot really describe the realization of business modeling. The artificial neural network is becoming a very popular alternative for prediction and classification tasks because of its associative memory characteristic and generalization capability. However, neural networks have been criticized for their long training process in classification problems. To address this drawback, this study explores the performance of data classification by integrating the artificial neural network technique with linear discriminant analysis and with fuzzy discriminant analysis, respectively. To demonstrate that including the classification results from linear discriminant and fuzzy discriminant analysis improves the classification accuracy of the designed neural networks, classification tasks are performed on two data sets: the widely used Iris data and a practical bank credit card data set. As the results reveal, the two proposed integrated approaches provide a better initial solution and hence converge much faster than conventional neural networks. In addition, compared with the traditional neural network approach, classification accuracy increases in both cases under the two proposed methodologies. Moreover, the superiority of the proposed technique can be observed by comparing its classification results with those obtained using only linear discriminant or fuzzy discriminant analysis.
Kumar, Nishant. "Sentiment Analysis Using Hybrid Machine Learning Technique". Thesis, 2016. http://ethesis.nitrkl.ac.in/8616/1/2016_MT_214CS3513_Nishant_Kumar.pdf.
Kang, Shu-Tyng, and 康舒婷. "Applying Hybrid Data Mining Approach to Develop a Cerebrovascular Disease Prediction Model". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/70886576446134037009.
國立臺灣科技大學
工業管理系
101
With Taiwan's economic take-off, Taiwanese people have gradually placed more importance on health and medical issues. According to data reported by the WHO, stroke has been a major threat to health in developed countries since 1999. In Taiwan, stroke ranks third among the top ten causes of death. How to prevent and detect stroke is therefore a very important issue. The best ways to examine and diagnose stroke are brain imaging and carotid ultrasound. However, these examinations are far more expensive than others, and people without a doctor's recommendation must pay all the expenses themselves. This is the main reason some people are unwilling to undergo these brain examinations. We used brain examination data provided by a hospital in Taipei. We perform feature selection to find which features are important to the cause of stroke, using a hybrid method that combines data mining technology with meta-heuristic algorithms (including a genetic algorithm, particle swarm optimization, and a back-propagation network). Finally, we use these features to develop a cerebrovascular disease prediction model. The model can help doctors advise people on whether to undergo a brain examination, so that people can know the state of their brain health and seek prevention and treatment as early as possible.
Herani, Inggi Rengganing, and Inggi Rengganing Herani. "Development of Carotid Artery Diagnostic Prediction Model using Hybrid Data Mining Approach". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/67320389316269101304.
國立臺灣科技大學
工業管理系
101
Carotid artery disease is the main cause of disability and death related to stroke or cerebrovascular disease, and stroke is responsible for a high number of deaths worldwide. Because carotid artery disease has no symptoms, it is important to perform a medical test using ultrasound or an imaging method to visualize the carotid arteries. This kind of test is uncomfortable, expensive, and carries some risks. To reduce the risks and the economic burden, this research presents a method that generates important information to help doctors diagnose carotid artery disease. A hybrid data mining approach is applied to produce several combination models. Real-world datasets are often imbalanced: they are dominated by normal data, with only a small percentage of abnormal or sick data. To overcome this imbalance, we use the Synthetic Minority Over-Sampling Technique (SMOTE) and simple k-means clustering. SMOTE over-samples the minority data, while clustering under-samples the majority data. A genetic algorithm and gain ratio are also used to select important features. These methods focus on selecting a subset of salient features and reducing the number of features. Finally, the new dataset is processed using a Back Propagation Network (BPN), Naive Bayes, and a decision tree to predict the disease. Experimental results show that these hybrid methods achieve high accuracy, so they can assist doctors in analyzing and predicting the presence of carotid artery disease in patients.
Feng, Hsin-lan, and 馮欣嵐. "Applying a Hybrid Data Mining Approach to Develop a Stroke Prediction Model". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/75650209307133005293.
Full text of source
國立臺灣科技大學
工業管理系
101
Stroke has become a major health threat worldwide, with high death and disability rates. Therefore, how to prevent and detect stroke is now an important issue. The best way to examine and detect stroke is brain imaging and ultrasound; however, these examinations are relatively expensive. People will not take them without a doctor’s advice or obvious symptoms. Consequently, we use the regular health examination, which is cheaper and easy to take, as the basis of our research, and apply hybrid data mining techniques to find the association between regular health examination results and stroke. We also add suggestions to the regular health examination report, hoping to provide more information to the public. We use brain examination data from 2004 to 2011 to develop a Stroke-Risk-Predicting-Assistance Model with BPN. First, we perform clustering-based under-sampling and then find the relevant features using rough set theory, information gain and gain ratio. Finally, we use the Taguchi method to set the best parameters for BPN. The Stroke-Risk-Predicting-Assistance Model can help doctors advise people on whether to undergo a brain examination and maximize the value of the regular health examination. People can know the state of their brain health and prevent or treat stroke as early as possible.
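Information gain and gain ratio, two of the feature-relevance measures named in this abstract, can be computed as in the sketch below. The feature names and values are hypothetical toy data, not taken from the thesis, and the helper names are assumptions.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def info_gain_and_ratio(feature, labels):
    """Return (information gain, gain ratio) of a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature)  # penalises features with many values
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

# Hypothetical examination records: 1 = stroke, 0 = no stroke.
stroke = [1, 1, 0, 0, 1, 0]
hypertension = ["yes", "yes", "no", "no", "yes", "no"]  # separates classes fully
age_band = ["a", "b", "a", "b", "a", "b"]               # weakly informative

gain_h, ratio_h = info_gain_and_ratio(hypertension, stroke)
gain_a, ratio_a = info_gain_and_ratio(age_band, stroke)
print(gain_h, ratio_h, gain_a, ratio_a)
```

In this toy data the first feature would be ranked far above the second; a feature-selection step would keep the highest-ranked features.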
Wang, Yu-Chung, and 王鈺中. "Evaluating Renewable Energy Policies Using Hybrid Data Mining and Analytic Hierarchy Process Modeling". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/68531396159632779020.
Full text of source
國立清華大學
工業工程與工程管理學系
102
When a large percentage of energy (>90%) is generated from fossil fuels, carbon dioxide emissions increase the greenhouse effect. Therefore, renewable, sustainable, and economically viable energy sources are needed as alternatives to fossil fuels. The facility and installation costs for generating renewable energy are much higher than those of fossil fuel facilities. Thus, governments need effective policies, regulations, and incentive programs to promote the use of renewable energy. Renewable energy can be classified into different categories, including offshore and onshore wind power, photovoltaic solar, and geothermal. The policies used for promoting specific categories vary significantly. These policies depend on the policy goals, regulations, taxation, incentives and promotional schemes. The purpose of this study is to apply clustering techniques and AHP to analyze types of renewable energy and their attributes with respect to economic factors, energy resources and supply, and environmental effects. The AHP method is used to evaluate actions that can resolve challenges found in the development of renewable energy. The study provides scientific results to help the government plan renewable energy policies. The data for the case study are collected from Taiwan’s renewable energy statistics related to PV cells, wind farms, ocean thermal energy, geothermal energy, hydro power, and solid waste fuels. The research has four major results and findings: (1) constructing models for analyzing renewable energy policies using data mining techniques; (2) using seven categories of renewable energy sources in Taiwan, such as wind power, photovoltaic, geothermal and solid waste power, as specific renewable energy types to find the best promotional policy; (3) providing reliable advice to government (and the means to effectively analyze given scenarios) for policy planning and execution; and (4) offering suggestions on renewable energy policy drawn from benchmark countries, together with strategies from other countries.
Tseng, Jui-Chih, and 曾瑞智. "A Hybrid Data Mining Approach to Construct the Target Customers Choice Reference Model". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/26022623832684281818.
Full text of source
大同大學
資訊經營學系(所)
101
Marketing, the prevailing commercial activity of enterprises, is an important strategy for increasing customer loyalty and attracting potential customers for greater profit. To maximize profit with limited resources, enterprises need to choose the right target customers. Therefore, it is necessary to build an efficient, objective and accurate target customer choice model. Using data mining techniques to find target customers is a traditional approach. However, past research mainly focused on finding the most accurate classifier, while different classifiers perform differently in varied situations. This study therefore proposes an integrated target customer choice model that combines support vector machine, neural network and the K-Means algorithm into a two-phase analysis model. This model is expected to enhance classification accuracy and reduce Type I and Type II errors at the same time. The research results indicate that the integrated model is effective in simultaneously enhancing classification accuracy and reducing Type I and Type II errors.
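The three evaluation measures named in this abstract can be computed from predictions as sketched below. The abstract does not define its error types, so this sketch assumes the common convention that class 1 is "target customer", a Type I error is a false positive, and a Type II error is a false negative; the labels are hypothetical.

```python
def error_rates(actual, predicted):
    """Return (accuracy, Type I rate, Type II rate) for binary labels.

    Assumed convention: 1 = target customer, 0 = non-target.
    Type I  = share of true non-targets predicted as targets.
    Type II = share of true targets predicted as non-targets.
    """
    pairs = list(zip(actual, predicted))
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    return accuracy, false_pos / actual.count(0), false_neg / actual.count(1)

# Hypothetical ground truth and classifier output.
actual    = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
accuracy, type_1, type_2 = error_rates(actual, predicted)
print(accuracy, type_1, type_2)
```

Reporting both error rates alongside accuracy matters because a classifier can raise accuracy while making one kind of error much more often.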
蔡永順. "An RFID-based Data Mining Using Hybrid and Heuristic Methods for Quality Management". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/82687441413491623671.
Full text of source
國立交通大學
資訊管理研究所
101
Many enterprises now face global competition and shortened life cycles for new products. If they cannot control product quality, they will delay product development and time to market, and will not be able to offer product variety promptly. Data mining can find hidden knowledge patterns in data and enable complex business processes to be understood and reengineered. In addition, RFID can effortlessly turn every object into a mobile network node that can be tracked, traced, and monitored, and that can trigger actions or respond to action requests. Therefore, this study focused on the integration and application of data mining and RFID systems for quality management. The purpose of this study was to propose an RFID-based hybrid heuristic data mining (RHHDM) framework to discover hidden and meaningful knowledge rules for product quality and to reengineer quality management processes in order to help enterprises enhance product quality. The RHHDM framework primarily utilized RFID, a manufacturing execution system (MES), a genetic algorithm (GA), an artificial neural network (ANN) based on the back propagation network (BPN) algorithm, a decision tree algorithm, and a Bayesian classification algorithm. In the RHHDM framework, the MES storing product quality data was the data source of the data mining system. This study proposed three types of algorithms to act as the nucleus of the data mining engines for data classification and prediction. These algorithms were divided into an experiment group and a contrast group in the data mining system. In the experiment group, this study utilized artificial intelligence approaches to propose a hybrid heuristic method integrating the GA and BPN algorithms (GABPN). In the contrast group, this study utilized statistical approaches, namely the decision tree algorithm and the Bayesian classification algorithm. After testing and verifying these algorithms, the best algorithm was selected and used to mine hidden product quality knowledge.
Then, this study incorporated the discovered product quality knowledge into the RFID system in the RHHDM framework. The RFID system enabled the quality management processes to be reengineered. This study applied the proposed RHHDM framework to an enterprise in practice in order to test and verify the applicability and effectiveness of the framework. The results showed that the proposed RHHDM framework could tightly integrate data mining, the RFID system, and quality management processes. The RHHDM framework was applied to improve the traceability and visibility of product quality and to support better decisions by the enterprise. According to the experimental results and analyses, the RHHDM framework could help the enterprise enhance customer satisfaction, reduce costs, improve the efficiency of internal processes, and enable organizational learning and growth. The proposed RHHDM framework could also be applied to production management, warehouse management, and product recommendation for sales, in order to help enterprises enhance competitiveness.
Yu, Ting-Yi, and 尤婷藝. "Using A Hybrid Meta-evolutionary Algorithm for Mining Classification Rules Through Microarray Data". Thesis, 2010. http://ndltd.ncl.edu.tw/handle/5ns46j.
Full text of source
國立虎尾科技大學
資訊管理研究所
98
With the rapid development of information technology, microarray data has become an important field of study in cancer research. However, microarray data has high-dimensional attributes and a small sample size, resulting in lengthy computation time and low classification accuracy. Given these classification issues, obtaining more accurate, higher-quality prediction results has become an important area of research. This thesis proposes a hybrid evolutionary algorithm that combines a genetic algorithm and binary particle swarm optimization with a fuzzy discriminant function. The proposed method simultaneously estimates the fitness value for classification, extracts significant variables, and tunes the parameters of the fuzzy membership function. By adjusting the dimension of the microarray data and the choice of membership function, a small number of significant attributes can achieve high classification accuracy. Fuzzy rules can also be observed through the data attributes and the relationships between categories. To reduce the large computation time of the classification process, this study integrates grid computing technology into the proposed approach. The experimental results show that the proposed method achieves higher classification accuracy and effectively reduces computation time.
Hassan, Syed Zahid. "A novel hybrid data mining approach for knowledge extraction and classification in medical databases". Thesis, 2008. https://figshare.com/articles/thesis/A_novel_hybrid_data_mining_approach_for_knowledge_extraction_and_classification_in_medical_databases/21443082.
Full text of source
Over the past several years, there has been an explosion in the amount of medical data generated and subsequently collected in the medical domain. Data mining techniques have been used extensively to mine medical data. Obtaining high-quality data mining results is very challenging because of inconsistency among the results of different data mining algorithms and noise in the medical data.
This thesis presents a novel hybrid data mining approach for knowledge extraction and classification in medical databases. The proposed approach is formulated to cluster features extracted from medical databases into soft clusters using unsupervised learning strategies and to fuse the decisions using serial and parallel data fusion techniques. The idea is to observe associations in the features and fuse the decisions made by learning algorithms to find the strong clusters that can have an impact on overall classification accuracy. Novel techniques such as serial cascaded data fusion, parallel majority-voting based neural data fusion and parallel neural network based data fusion are proposed, which allow the integration of various clustering algorithms in the hybrid data mining approach.
The proposed approach has been implemented and evaluated on benchmark databases such as the Digital Database for Screening Mammography, Wisconsin Breast Cancer, Pima Indians Diabetes and ECG Heart Arrhythmia.
A comparative performance analysis of the proposed hybrid data mining approach with other existing approaches for knowledge extraction and classification is presented. The experimental results demonstrate the effectiveness of the proposed approach in terms of improved classification accuracy on benchmark medical databases.
Guo, Mu-Liang, and 郭木良. "A Hybrid System Integrating Data Mining and Artificial Intelligence Approaches for Stock Price Prediction". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/30048291361050573695.
Full text of source
國立中正大學
財務金融研究所
102
In this study, we develop a new hybrid stock prediction system by integrating data mining and artificial intelligence techniques. Unlike other studies, this study proposes a system that does not predict stock prices using these techniques directly. We posit that technical indicators are not always effective: each indicator is affected by other indicators and by fundamental factors. Consequently, the proposed system integrates the two techniques to make the most of their advantages, based on technical and fundamental indicators. We conduct two experiments to examine the prediction ability of the proposed system across different industries. The results reveal that the proposed system is capable of determining the right time for an investor to avoid extra losses, increase profitability, and decrease trading costs.
HUANG, TING-XUAN, and 黃婷萱. "A Hybrid Data Mining Model for Analyzing the Association between Diabetes and Breast Cancer". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/59205764816125214266.
Full text of source
輔仁大學
企業管理學系管理學碩士班
104
Diabetes is a chronic disease that cannot be cured by current medical technology, and the number of deaths caused by its complications is increasing year by year. Breast cancer brings huge medical expenses and has become a burden on the National Health Insurance. The relationship between diabetes and cancer has been a well-known issue in recent years, and among all cancers, breast cancer has the highest incidence among Taiwanese women. Therefore, the purpose of this study is to apply data mining techniques in a retrospective cohort study of the association between diabetes and breast cancer. The proposed disease risk factor analysis model combines under-sampling based on clustering (SBC) with classification and regression trees (CART) to construct a disease prediction model. The National Health Insurance databases are analyzed to explore the disease risk factors affecting whether diabetic patients without breast cancer start dialysis treatment within the next two years. Experimental results showed that the prevalence and incidence rates were significantly higher among female patients suffering from “diabetic neuropathy” or “diabetes mellitus with peripheral circulatory disorder”. The model can reduce the effect of the class imbalance problem in big data and find potential disease risks. The proposed model can also be applied to other diseases and help alleviate the burden on the National Health Insurance.
Chen, Chien-Wei, and 陳建維. "Development of Real Time Production Control System in FAB By Hybrid Data Mining Approach". Thesis, 2007. http://ndltd.ncl.edu.tw/handle/60626635834370788788.
Full text of source
華梵大學
資訊管理學系碩士班
96
Using a machine learning-based real-time dispatching rule selection mechanism to develop knowledge bases (KBs) for production control systems (PCSs) has shown encouraging results in recent research. However, little research has focused on employing a real-time dispatching rule selection mechanism to improve production performance in the PCS of semiconductor wafer fabrication factories (FABs). Moreover, due to short product life cycles, most actual FABs produce multiple products, and the product mix changes from time to time. All earlier machine learning-based real-time PCSs must add new training samples and regenerate the KBs periodically. Hence, a machine learning-based PCS faces a training data overflow problem and increasing KB building time for the dispatching rule selection mechanism, and is not suited to on-line production control. To resolve these problems, the PCS KBs are developed in two phases: an SVM-based KB category selection mechanism and an SVM-based real-time dispatching rule classifier. This investigation therefore develops a hybrid data mining-based approach that covers the overall knowledge discovery in databases (KDD) process and comprises six key components: a simulation-based training example generation mechanism, a data normalization mechanism, GA-based feature selection through an SVM classifier, KB category building by a two-level self-organizing map (SOM) approach, an SVM-based KB category selection mechanism, and an SVM-based real-time dispatching rule classifier. In the KB category selection phase, the two-level SOM approach clusters the unclassified training data so that data with similar characteristics, defined as system attributes, fall into the same class. An SVM learning algorithm then learns the whole set of training examples with KB class labels to construct the KB category selection mechanism.
The proposed SVM classifier using the hybrid data mining-based approach yields better system performance than a classical machine learning-based dispatching rule selection mechanism and individual heuristic dispatching rules under various performance criteria over a long period in FABs.
Li, Jie-Ruei, and 李睿傑. "An Intelligent Vehicular Maintenance and Replacement System in Distribution Services: A Hybrid Data Mining Technique". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/16632688345502903558.
Full text of source
輔仁大學
資訊管理學系
97
As e-commerce has grown exponentially, the distribution service business has also grown and expanded quickly in recent years. E-commerce not only changes customers’ shopping behaviors and brings new opportunities to the B2C marketspace; from the merchant’s perspective, the maintenance, repair and operations (MRO) costs of vehicles in distribution services are also increasing dramatically. In this project, we develop an intelligent maintenance and replacement system to help managers and technicians conduct preventive maintenance and replacement for vehicles. Specifically, we employ association rule mining and sequential pattern mining to analyze the relationships and priorities among vehicle components in order to carry out preventive maintenance. Furthermore, we employ the C4.5 decision tree algorithm to predict which vehicles are at risk, based on historical maintenance records. This helps managers and technicians decide whether to repair a vehicle or sell it. To take advantage of a Web-based platform, we adopt .NET technology to develop the intelligent system based on the proposed methods, so that employees can access the services anytime and anywhere via the Internet. Finally, we will deploy the system in a distribution service to evaluate the accuracy and feasibility of the proposed model and system.
Chauhan, Ajay Singh. "Financial statement fraud detection Model based on Hybrid data mining methods: Proposing an optimized Detection model". Thesis, 2019. http://dspace.dtu.ac.in:8080/jspui/handle/repository/17200.
Full text of source
Rodic, Daniel. "A Hybrid heuristic-exhaustive search approach for rule extraction". Diss., 2001. http://hdl.handle.net/2263/25095.
Full text of source
Dissertation (MSc)--University of Pretoria, 2007.
Computer Science
unrestricted
Liu, Minhui. "Multivariate nonnormal regression models, information complexity, and genetic algorithms: a three way hybrid for intelligent data mining". 2006. http://etd.utk.edu/2006/LiuMinhui.pdf.
Full text of source
Yang, Chun-Yi, and 楊竣壹. "A Hybrid of Data Mining and Statistical Analysis Approach on Association between Pulmonary Tuberculosis and Lung Cancer". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/45849810654231649880.
Full text of source
國立臺灣科技大學
工業管理系
102
Background and objective: Tuberculosis (TB) is a global infectious disease and lung cancer is among the ten most fatal cancers in Taiwan, so it is important to understand the clinical pathology of both. This study explored the association of tuberculosis and lung cancer with other comorbidities and investigated whether any attribute could be a critical factor influencing the risk of lung cancer among TB patients, using a hybrid data mining and statistical approach. Methods: Study subjects were identified from the NHIRD with a diagnosis of tuberculosis between 2000 and 2002 and were tracked to 2011. In a cohort of 6,137 patients with tuberculosis aged over 20 years, 1,459 patients were assigned to the middle-age group and 3,527 patients to the elder-age group based on the result of a decision tree. Association rules, Cox regression and survival analysis were used for comparison between groups. Results: The incidence rate of lung cancer is approximately 4-fold higher in the elder-age group than in the middle-age group (39.03 versus 8.45 per 10,000 person-years). COPD increases the risk of lung cancer in both the middle-age group (6.64; 95% CI, 2.17-20.33) and the elder-age group (2.22; 95% CI, 1.52-3.23). In survival analysis, patients in the middle-age group are generally more likely to remain free from lung cancer than those in the elder-age group (98.9% versus 95.8%, log-rank p < 0.0001). Conclusions: This study provides a comprehensive analysis of the impact of age and comorbidities on the risk of lung cancer among tuberculosis patients. The risk may increase further in patients in the middle-age group than in those in the elder-age group.
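Taking the two reported incidence rates at face value (8.45 per 10,000 person-years for the middle-age group and 39.03 for the elder-age group), the fold difference can be checked with simple arithmetic. The helper `incidence_rate` and the counts passed to it are hypothetical and only illustrate how such a rate is formed; the abstract does not report case counts or person-years.

```python
def incidence_rate(cases, person_years, per=10_000):
    """Incidence rate expressed per `per` person-years of follow-up."""
    return cases / person_years * per

# Hypothetical counts, shown only to illustrate the definition.
example_rate = incidence_rate(cases=10, person_years=20_000)

# Rates as reported in the abstract (per 10,000 person-years).
middle_rate = 8.45
elder_rate = 39.03
rate_ratio = elder_rate / middle_rate
print(f"elder / middle incidence rate ratio: {rate_ratio:.2f}")
```

The ratio of the elder-age rate to the middle-age rate comes out between 4 and 5, in line with the "approximately 4-fold" difference stated in the results.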
Xiong, Chenxi. "HYBRID FEATURE SELECTION IN NETWORK INTRUSION DETECTION USING DECISION TREE". Thesis, 2020.
Find the full text of the source
Wang, Shu-Chao, and 王淑昭. "The Factors Affecting Academic Achievement for the 5th and 6th Grade Elementary School Students by Hybrid Data Mining Approach". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/75460218666045377079.
Full text of source
華梵大學
資訊管理學系碩士班
96
A total of 485 valid samples of 5th and 6th grade students were collected from an elementary school in Taipei County between 2004 and 2006. To address the student academic achievement problem, this study develops a hybrid Genetic Algorithm/Decision Tree (GA/DT) approach. The proposed GA/DT approach is then compared with DT, factor analysis combined with DT, and correlation analysis combined with DT. The results indicate that the key attributes of academic achievement for 5th and 6th grade elementary school students include mother’s age, father’s educational background, parenting methods, gender, family atmosphere, relationship between parents, living environment, rank in the family, family members, who the student lives with, preschool education, living situation, stature and so on. Among them, mother’s age is the most important attribute. A student’s academic achievement is excellent if the mother is over 36 and the father’s education is university level or above. A student’s academic achievement is poor if the mother is over 41, the father’s education is high school or vocational school, and the father’s parenting style is classified as “other”. The hybrid GA/DT prediction system has higher accuracy: its average accuracy reaches 69.4%, better than DT (61.2%), factor analysis combined with DT (59.1%) and correlation analysis combined with DT (54.5%).
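The genetic-algorithm half of a GA/DT hybrid can be sketched as a search over feature masks. In the sketch below the fitness function is a hypothetical stand-in: in a GA/DT setup it would typically be the validation accuracy of a decision tree trained on the selected features, which is not reproduced here. All names, the set `USEFUL`, and the parameter values are assumptions for illustration.

```python
import random

def tournament(pop, fitness, rng):
    """Pick the fitter of two randomly drawn individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximises `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

# Hypothetical stand-in fitness: reward three "useful" features and
# penalise mask size (a real run would score a decision tree instead).
USEFUL = {0, 3, 5}

def toy_fitness(mask):
    return sum(1 for i in USEFUL if mask[i]) - 0.2 * sum(mask)

best_mask = ga_select(n_features=8, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

The size penalty in the stand-in fitness mirrors the usual goal of such hybrids: keep accuracy high while reducing the number of attributes the tree has to consider.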
Chen, Li-Fei, and 陳麗妃. "A Hybrid Data Mining Framework with Rough Set Theory, Support Vector Machine, and Decision Tree and its Case Studies". Thesis, 2007. http://ndltd.ncl.edu.tw/handle/30869955008789719497.
Full text of source
國立清華大學
工業工程與工程管理學系
95
Support vector machine (SVM), rough set theory (RST) and decision tree (DT) are methodologies applied to various data mining problems, especially classification and prediction tasks. Studies have shown the ability of RST for feature selection, while SVM and DT are notable for their predictive power. This research aims to integrate the advantages of the SVM, RST and DT approaches to develop a hybrid framework that enhances the quality of class prediction as well as rule generation. In addition to building a classification model with acceptable accuracy, the ability to explain how decisions are made using simple, understandable and useful rules is a critical issue for human resource management. DT and RST can generate such rules; SVM, however, cannot. The framework consists of four main stages. The first stage selects the most important attributes: RST is applied to eliminate redundant and irrelevant attributes without losing any information about classification. The second stage reduces noisy objects, which is accomplished by cross-validation using SVM. If the new data set would induce a data imbalance problem, the rules generated by RST are used to adjust the class distribution (stage 3). Through these stages, a data set with fewer dimensions and a higher degree of purity, with a similar class distribution, is screened out and used to generate rules with DT, which completes the last stage. In addition, decisions concerning personnel selection prediction always involve handling data with high dimensionality, uncertainty and complexity, which cause traditional statistical methods to suffer from low test power. For validation, real cases of personnel selection at two high-tech companies in Hsinchu, Taiwan, covering direct and indirect labor, are studied using the proposed hybrid data mining framework.
Implementation results show that the proposed approach is effective and performs better than traditional SVM, RST and DT.