Theses on the topic "Decision Tree with CART algorithm"
Consult the top 50 theses for your research on the topic "Decision Tree with CART algorithm".
Explore theses on a wide variety of disciplines and organize your bibliography correctly.
Hari, Vijaya. "Empirical Investigation of CART and Decision Tree Extraction from Neural Networks". Ohio University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1235676338.
Konda, Ramesh. "Predicting Machining Rate in Non-Traditional Machining using Decision Tree Inductive Learning". NSUWorks, 2010. http://nsuworks.nova.edu/gscis_etd/199.
Fernandes, Fabiano Rodrigues. "Emprego de diferentes algoritmos de árvores de decisão na classificação da atividade celular in vitro para tratamentos de superfícies de titânio". Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/165456.
Interest in the analysis and characterization of biomedical materials grows as the need to select the adequate material increases. Depending on the conditions to which materials are submitted, characterization may involve the evaluation of mechanical, electrical, optical, chemical and thermal properties, besides bioactivity and immunogenicity. The literature reports the application of decision trees, using the SimpleCart (CART) and J48 algorithms, to classify a dataset generated from the results of scientific articles. The objective of this study was therefore to identify the surface characteristics that optimize cellular activity. Based on published articles, the effect of the surface treatment of titanium on in vitro cells (MC3T3-E1 cells) was evaluated, and applying the SimpleCart algorithm was found to give better results than J48. In this sense, the present study applies the CHAID (Chi-square Automatic Interaction Detection) algorithm and Exhaustive CHAID to the surveyed data and compares the results with those of the SimpleCart algorithm. Validation showed that Exhaustive CHAID obtained better results than CHAID, with 75.9% accurate estimation against 58.5%, and a standard error of 7.9% against 9.1%, respectively. The SimpleCart (CART) results already reported in the literature reached 34.5% accurate estimation with a standard error of 8.8%. Regarding execution time over the 22,000 records, Exhaustive CHAID presented the best times, with a gain of 0.02 seconds over CHAID and 14.45 seconds over SimpleCart (CART).
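The abstract above contrasts CART's impurity-based splits with CHAID's chi-square tests. A minimal numpy sketch of the two node-scoring criteria (function names and the tiny arrays are illustrative, not data from the thesis):

```python
import numpy as np

def gini_impurity(y):
    # CART node score: 1 - sum_k p_k^2 (0 means a pure node)
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def chi2_statistic(x, y):
    # CHAID-style score: chi-square of the predictor/class contingency table
    xs, ys = np.unique(x), np.unique(y)
    obs = np.array([[np.sum((x == a) & (y == b)) for b in ys] for a in xs],
                   dtype=float)
    exp = obs.sum(axis=1, keepdims=True) * obs.sum(axis=0, keepdims=True) / obs.sum()
    return float(np.sum((obs - exp) ** 2 / exp))

y = np.array([0, 0, 1, 1])
x_good = np.array([0, 0, 1, 1])   # perfectly associated with the class
x_bad = np.array([0, 1, 0, 1])    # independent of the class
```

A CHAID-style splitter would prefer `x_good` (larger chi-square), just as CART would prefer the split that drives Gini impurity to zero.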
Kassim, M. E. "Elliptical cost-sensitive decision tree algorithm (ECSDT)". Thesis, University of Salford, 2018. http://usir.salford.ac.uk/47191/.
Shi, Haijian. "Best-first Decision Tree Learning". The University of Waikato, 2007. http://hdl.handle.net/10289/2317.
Texto completoGirardini, Davide <1985>. "Efficient implementation of Treant: a robust decision tree learning algorithm". Master's Degree Thesis, Università Ca' Foscari Venezia, 2020. http://hdl.handle.net/10579/17423.
Texto completoTrivedi, Ankit P. "Decision tree-based machine learning algorithm for in-node vehicle classification". Thesis, California State University, Long Beach, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10196455.
Texto completoThis paper proposes an in-node microprocessor-based vehicle classification approach to analyze and determine the types of vehicles passing over a 3-axis magnetometer sensor. The approach for vehicle classification utilizes J48 classification algorithm implemented in Weka (a machine learning software suite). J48 is Quinlan's C4.5 algorithm, an extension of decision tree machine learning based on an ID3 algorithm. The decision tree model is generated from a set of features extracted from vehicles passing over the 3-axis sensor. The features are attributes provided with correct classifications to the J48 training algorithm to generate a decision tree model with varying degrees of classification rates based on cross-validation. Ideally, using fewer attributes to generate the model allows for the highest computational efficiency due to fewer features needed to be calculated while minimalizing the tree with fewer branches. The generated tree model can then be easily implemented using nested if-loops in any language on a multitude of microprocessors. Also, setting an adaptive baseline to negate the effects of the background magnetic field allows reuse of the same tree model in multiple environments. The result of the experiment shows that the vehicle classification system is effective and efficient.
Krook, Jonatan. "Predicting low airfares with time series features and a decision tree algorithm". Thesis, Uppsala universitet, Statistiska institutionen, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-353274.
Texto completoJeenanunta, Chawalit. "The Approach-dependent, Time-dependent, Label-constrained Shortest Path Problem and Enhancements for the CART Algorithm with Application to Transportation Systems". Diss., Virginia Tech, 2004. http://hdl.handle.net/10919/27773.
Texto completoPh. D.
Feychting, Sara. "Incredible tweets : Automated credibility analysis in Twitter feeds using an alternating decision tree algorithm". Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186711.
Texto completoDoubleday, Kevin. "Generation of Individualized Treatment Decision Tree Algorithm with Application to Randomized Control Trials and Electronic Medical Record Data". Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/613559.
Texto completoVANCE, DANNY W. "AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING". University of Cincinnati / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1162335608.
Texto completoSantos, Ernani Possato dos. "Análise de crédito com segmentação da carteira, modelos de análise discriminante, regressão logística e classification and regression trees (CART)". Universidade Presbiteriana Mackenzie, 2015. http://tede.mackenzie.br/jspui/handle/tede/970.
Texto completoThe credit claims to be one of the most important tools to trigger and move the economic wheel. Once it is well used it will bring benefits on a large scale to society; although if it is used without any balance it might bring loss to the banks, companies, to governments and also to the population. In relation to this context it becomes fundamental to evaluate models of credit capable of anticipating processses of default with an adequate degree of accuracy so as to avoid or at least to reduce the risk of credit. This study also aims to evaluate three credit risk models, being two parametric models, discriminating analysis and logistic regression, and one non-parametric, decision tree, aiming to check the accuracy of them, before and after the segmentation of such sample through the criteria of costumer s size. This research relates to an applied study about Industry BASE.
Gerdes, Mike. "Predictive Health Monitoring for Aircraft Systems using Decision Trees". Licentiate thesis, Linköpings universitet, Fluida och mekatroniska system, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-105843.
Texto completoJohansson, Viktor. "A sensor orientation and signal preprocessing study of a personal fall detection algorithm". Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-21375.
Texto completoVinnemeier, Christof David [Verfasser], Jürgen [Akademischer Betreuer] May, Uwe [Akademischer Betreuer] Groß y Tim [Akademischer Betreuer] Friede. "Establishment of a clinical algorithm for the diagnosis of P. falciparum malaria in children from an endemic area using a Classification and Regression Tree (CART) model / Christof David Vinnemeier. Gutachter: Uwe Groß ; Tim Friede. Betreuer: Jürgen May". Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2015. http://d-nb.info/1065882017/34.
Texto completoMcNamara, Nathan Patrick. "Using Decision Trees to Predict Intent to Use Passive Occupational Exoskeletons in Manufacturing Tasks". Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1605720844135027.
Texto completoAstapenko, D. "Automated system design optimisation". Thesis, Loughborough University, 2010. https://dspace.lboro.ac.uk/2134/6863.
Texto completoKhan, Kashif. "A distributed computing architecture to enable advances in field operations and management of distributed infrastructure". Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/a-distributed-computing-architecture-to-enable-advances-in-field-operations-and-management-of-distributed-infrastructure(a9181e99-adf3-47cb-93e1-89d267219e50).html.
Texto completoRodríguez, Elen Yanina Aguirre. "Técnicas de aprendizado de máquina para predição do custo da logística de transporte : uma aplicação em empresa do segmento de autopeças /". Guaratinguetá, 2020. http://hdl.handle.net/11449/192326.
Texto completoResumo: Em diferentes aspectos da vida cotidiana, o ser humano é forçado a escolher entre várias opções, esse processo é conhecido como tomada de decisão. No nível do negócio, a tomada de decisões desempenha um papel muito importante, porque dessas decisões depende o sucesso ou o fracasso das organizações. No entanto, em muitos casos, tomar decisões erradas pode gerar grandes custos. Desta forma, alguns dos problemas de tomada de decisão que um gerente enfrenta comumente são, por exemplo, a decisão para determinar um preço, a decisão de comprar ou fabricar, em problemas de logística, problemas de armazenamento, etc. Por outro lado, a coleta de dados tornou-se uma vantagem competitiva, pois pode ser utilizada para análise e extração de resultados significativos por meio da aplicação de diversas técnicas, como estatística, simulação, matemática, econometria e técnicas atuais, como aprendizagem de máquina para a criação de modelos preditivos. Além disso, há evidências na literatura de que a criação de modelos com técnicas de aprendizagem de máquina têm um impacto positivo na indústria e em diferentes áreas de pesquisa. Nesse contexto, o presente trabalho propõe o desenvolvimento de um modelo preditivo para tomada de decisão, usando as técnicas supervisionadas de aprendizado de máquina, e combinando o modelo gerado com as restrições pertencentes ao processo de otimização. O objetivo da proposta é treinar um modelo matemático com dados históricos de um processo decisório e obter os predit... (Resumo completo, clicar acesso eletrônico abaixo)
Master's
Odeh, Khaled. "Nouveaux algorithmes pour le traitement probabiliste et logique des arbres de défaillance". Compiègne, 1995. http://www.theses.fr/1995COMPD846.
Texto completoPazúriková, Jana. "Adaptivní model pro simulaci znečištění ovzduší". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236487.
Texto completoCazzolato, Mirela Teixeira. "Classificação de data streams utilizando árvore de decisão estatística e a teoria dos fractais na análise evolutiva dos dados". Universidade Federal de São Carlos, 2014. https://repositorio.ufscar.br/handle/ufscar/565.
Texto completoFinanciadora de Estudos e Projetos
A data stream is generated quickly, continuously, in order, and in large quantities. To process data streams one must consider, among other factors, the limited use of memory, the need for real-time processing, the accuracy of the results, and concept drift (which occurs when the concept of the data being analyzed changes). A decision tree is a popular representation of a classifier: it is intuitive and fast to build, and generally obtains high accuracy. The incremental decision tree techniques in the literature generally have high computational costs to construct and update the model, especially regarding the calculation used to split decision nodes. Existing methods are conservative when dealing with limited amounts of data, tending to improve their results as the number of examples increases. Another problem is that many real-world applications generate noisy data, to which existing techniques have low tolerance. This work aims to develop decision tree methods for data streams that address these deficiencies in the current state of the art. A further objective is to develop a technique to detect concept drift using fractal theory; this functionality should indicate when the model needs to be corrected, allowing an adequate description of the most recent events. To achieve these objectives, three decision tree algorithms were developed: StARMiner Tree, Automatic StARMiner Tree, and Information Gain StARMiner Tree. These algorithms use a fast statistical method as the split heuristic, one that does not depend on the number of examples. In the experiments the algorithms achieved high accuracy, also showing tolerant behavior when classifying noisy data. Finally, a drift detection method based on fractal theory was proposed to detect changes in the data distribution.
The method, called the Fractal Drift Detection Method, detects significant changes in the data distribution, causing the model to be updated whenever it no longer describes the data (i.e., becomes obsolete). The method achieved good results in the classification of data containing concept drift, proving suitable for evolutionary analysis of data.
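The fractal-based detector itself is specific to this thesis; as a generic illustration of the same idea (update the model when recent data stop matching it), here is a simple sliding-window error-rate monitor. The window size and threshold are arbitrary choices, not the authors' values:

```python
import numpy as np

def detect_drift(errors, window=30, threshold=0.3):
    # flag the first index where the recent error rate exceeds the
    # initial (reference) error rate by more than `threshold`
    errors = np.asarray(errors, dtype=float)
    ref = errors[:window].mean()
    for i in range(window, len(errors) - window + 1):
        if errors[i:i + window].mean() - ref > threshold:
            return i          # model looks obsolete around here: retrain
    return -1                 # no drift detected

# a classifier that is perfect, then suddenly always wrong (concept change)
stream = [0] * 100 + [1] * 100
```

On this stream the monitor fires shortly after the change point at index 100, once enough errors have entered the window.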
Bridgstock, Ruth Sarah. "Success in the protean career : a predictive study of professional artists and tertiary arts graduates". Thesis, Queensland University of Technology, 2007. https://eprints.qut.edu.au/16575/1/Ruth_Bridgstock_Thesis.pdf.
Cagnini, Henry Emanuel Leal. "Estimation of distribution algorithms for clustering and classification". Pontifícia Universidade Católica do Rio Grande do Sul, 2017. http://tede2.pucrs.br/tede2/handle/tede/7384.
Texto completoMade available in DSpace on 2017-06-29T11:51:00Z (GMT). No. of bitstreams: 1 DIS_HENRY_EMANUEL_LEAL_CAGNINI_COMPLETO.pdf: 3650909 bytes, checksum: 55d52061a10460875dba677a9812fe9c (MD5) Previous issue date: 2017-03-20
Extracting meaningful information from data is not an easy task. Data can come in batches or through a continuous stream, and can be incomplete, duplicated, or noisy. Moreover, there are several algorithms for data mining tasks, and the no-free-lunch theorem states that there is no single best algorithm for all problems. As a final obstacle, algorithms usually require hyperparameters to be set in order to operate, which often demands a minimum knowledge of the application domain for fine-tuning. Since many traditional data mining algorithms employ a greedy local search strategy, fine-tuning is a crucial step towards achieving better predictive models. Estimation of distribution algorithms, on the other hand, perform a global search, which is often more efficient than a wide search through the set of possible parameters. Using a quality function, an estimation of distribution algorithm iteratively seeks better solutions throughout its evolutionary process. Based on the benefits that estimation of distribution algorithms may offer to clustering and decision tree induction, two data mining tasks considered NP-hard and NP-hard/NP-complete, respectively, this work aims at developing novel algorithms that obtain better results than traditional greedy algorithms and baseline evolutionary approaches.
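As a concrete illustration of the estimation-of-distribution idea (evolve a probability model instead of a population), here is a compact genetic algorithm on the OneMax toy problem. This is a textbook EDA sketch, not the clustering or tree-induction algorithms the thesis develops:

```python
import numpy as np

def cga_onemax(n_bits=20, virtual_pop=50, iters=2000, seed=0):
    # compact GA: the "population" is just a vector of per-bit probabilities
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)
    best = 0
    for _ in range(iters):
        a = rng.random(n_bits) < p          # sample two individuals
        b = rng.random(n_bits) < p
        if a.sum() < b.sum():
            a, b = b, a                     # a = winner on OneMax fitness
        best = max(best, int(a.sum()))
        # nudge the probability model toward the winner, away from the loser
        p += (a.astype(float) - b.astype(float)) / virtual_pop
        p = np.clip(p, 0.0, 1.0)
    return p, best

p, best = cga_onemax()
```

In the thesis setting, the sampled bit-strings would encode candidate partitions or trees and the quality function would replace the OneMax bit count.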
Baker, Peter John. "Applied Bayesian modelling in genetics". Thesis, Queensland University of Technology, 2001.
Covi, Patrick. "Multi-hazard analysis of steel structures subjected to fire following earthquake". Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/313383.
Juozenaite, Ineta. "Application of machine learning techniques for solving real world business problems : the case study - target marketing of insurance policies". Master's thesis, 2018. http://hdl.handle.net/10362/32410.
Texto completoThe concept of machine learning has been around for decades, but now it is becoming more and more popular not only in the business, but everywhere else as well. It is because of increased amount of data, cheaper data storage, more powerful and affordable computational processing. The complexity of business environment leads companies to use data-driven decision making to work more efficiently. The most common machine learning methods, like Logistic Regression, Decision Tree, Artificial Neural Network and Support Vector Machine, with their applications are reviewed in this work. Insurance industry has one of the most competitive business environment and as a result, the use of machine learning techniques is growing in this industry. In this work, above mentioned machine learning methods are used to build predictive model for target marketing campaign of caravan insurance policies to achieve greater profitability. Information Gain and Chi-squared metrics, Regression Stepwise, R package “Boruta”, Spearman correlation analysis, distribution graphs by target variable, as well as basic statistics of all variables are used for feature selection. To solve this real-world business problem, the best final chosen predictive model is Multilayer Perceptron with backpropagation learning algorithm with 1 hidden layer and 12 hidden neurons.
Chiu, Chun-Chieh and 邱俊傑. "CUDT: A CUDA Based Decision Tree Algorithm". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/88189185018035112843.
National Chiao Tung University
Institute of Computer Science and Engineering
99
Classification is an important issue in both machine learning and data mining, and the decision tree is one of the best-known classification models. In real cases the data dimension is high and the data size is huge, so building a decision tree over a large database is computationally expensive. The GPU is a processor specially designed for graphics, and the highly parallel nature of graphics processing shaped today's GPU architecture. GPGPU means using the GPU to solve non-graphics problems that need large amounts of computing power; thanks to the high performance and capacity/price ratio, much research uses GPUs for heavy computation. Compute Unified Device Architecture (CUDA) is the GPGPU solution provided by NVIDIA. This thesis presents a new parallel decision tree algorithm based on CUDA that parallelizes the tree-building phase. In our system, the CPU is responsible for flow control and the GPU for computation. Compared with the Weka J48 algorithm, our system is 5~6 times faster; compared with SPRINT on large data sets, our CUDT achieves about an 18-fold speedup.
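The CUDA kernels themselves are not shown in the abstract, but the building-phase work it parallelizes is essentially a per-feature split scan. A numpy sketch of that scan for one numeric feature and binary 0/1 labels, vectorized the way a GPU thread block would process it (names and data are illustrative):

```python
import numpy as np

def best_split(x, y):
    # evaluate every threshold of one feature in a single vectorized pass
    order = np.argsort(x)
    xs, ys = x[order], y[order].astype(float)
    n = len(ys)
    left_ones = np.cumsum(ys)[:-1]        # class-1 counts left of each cut
    n_left = np.arange(1, n)
    n_right = n - n_left
    right_ones = ys.sum() - left_ones

    def gini(ones, size):
        p = ones / size
        return 1.0 - p ** 2 - (1.0 - p) ** 2

    # size-weighted Gini impurity of every possible cut point
    weighted = (n_left * gini(left_ones, n_left)
                + n_right * gini(right_ones, n_right)) / n
    k = int(np.argmin(weighted))
    return (xs[k] + xs[k + 1]) / 2.0, float(weighted[k])

thr, score = best_split(np.array([1.0, 2.0, 3.0, 4.0]),
                        np.array([0, 0, 1, 1]))
```

Each feature's scan is independent, which is what makes the building phase map naturally onto one GPU kernel launch per node level.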
Lai, Jian-Cheng and 賴建丞. "Fast Quad-Tree Depth Decision Algorithm for HEVC Coding Tree Block". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/39ucm4.
National Formosa University
Institute of Computer Science and Information Engineering
102
High Efficiency Video Coding (HEVC) is a recently developed compression technique for ultra-high-definition video that provides a higher compression ratio and throughput than the previous standard, H.264/AVC; it is therefore widely used for limited-bandwidth network transmission and confined storage space. To obtain a higher compression ratio while maintaining video quality, the HEVC encoder provides variable block partitioning and mode prediction. If every block is evaluated during the mode decision process, a lot of encoding time is consumed, limiting the applicability of HEVC in real time. Hence, many fast algorithms have been proposed to eliminate block partitions or mode predictions. In natural video, neighboring blocks are highly correlated with the current block, so reference-block methods terminate or eliminate block or mode prediction early; they use the lower computation of mode reduction to obtain a good compression ratio and time savings, and are widely proposed as fast algorithms for HEVC. On the other hand, non-reference methods have been proposed that extract features from the video frames to predict the termination condition. This thesis proposes two quad-tree depth decision methods: a reference method based on depth correlation and a non-reference method based on edge strength detection. In the reference-block method, we find up to 90% correlation with the co-located coding tree block (CTB) in the previous frame; we therefore use the co-located CTB depth information to limit the depth partitioning of the current CTB. Unlike previously proposed methods, the proposed method extends the partition depth by one level. However, prediction is poor for fast-moving objects or scene changes, where the correlation between frames is lower.
To address this disadvantage, the edge strength detection method detects the structural variation of a CTB to predict the coding depth. Since this method does not reference neighboring blocks, it predicts well on highly varying video sequences, but poorly when edges are not obvious: in dark videos, for example, the edges are faint and the algorithm predicts the depth level poorly. Finally, the proposed fast methods are implemented in the HM 10.1 model to demonstrate their efficiency. The edge strength detection method obtains 23.1% time savings with a BD-bitrate increase close to 0.28% on average, and the depth-correlation method provides about 21.1% time savings with a BD-bitrate increase of 0.17% on average.
蔡智政. "Causing factor analysis of low-yield wafer using CART decision tree and data visualization". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/86362726149609751227.
Yuan Ze University
Department of Industrial Engineering and Management
90
Every day, corporations generate large amounts of production data. It is difficult to extract information from such data rapidly in an environment full of questions, variables, and competition, and production managers cannot immediately read and analyze data that is both voluminous and varied. Yield analysis is critical for IC manufacturing, since yield is directly related to production cost and competitiveness in the market. To monitor the manufacturing process and product quality, manufacturing data are automatically collected during wafer fabrication. Engineers use these data to select possible causing factors of low-yield wafers and employ statistical techniques (e.g., design of experiments) to verify the hypotheses. This approach, however, is difficult due to the large number of parameters (from hundreds to thousands) and the complicated interactions among them. This research employs a CART decision tree to help engineers select the possible causing factors of low-yield wafers. The input is the lot in-process control (LPC) data recorded by metrology machines during wafer fabrication. In addition to the Gini index, two indexes are developed for tree node splitting. Decision trees generated using these node-splitting criteria provide engineers with useful rules for analyzing the causing factors of low-yield wafers.
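The screening step described above can be sketched with scikit-learn's CART implementation on synthetic stand-in data; the parameter that drives yield is planted at index 2, and nothing here comes from real LPC records:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
lpc = rng.normal(size=(400, 6))              # six hypothetical in-line parameters
low_yield = (lpc[:, 2] > 0.5).astype(int)    # parameter 2 drives low yield

# fit a shallow CART tree, then rank parameters by importance
cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(lpc, low_yield)
suspects = np.argsort(cart.feature_importances_)[::-1]   # most suspicious first
```

The top-ranked parameters become the hypotheses that engineers then verify with designed experiments, rather than testing hundreds of parameters blindly.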
Hsieh, Cheng-Hao and 謝正豪. "An Approach with Bat Algorithm and Decision Tree". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/38211713696274285313.
Huafan University
Master's Program, Department of Information Management
102
The amount of information is rapidly increasing, and data mining is widely used. A decision tree can process data and provide rules in a tree structure, so that decision makers can rapidly obtain the information hidden in the data; it can therefore be used to solve problems from many domains. Parameters must be set before using a decision tree, because how they are set affects the results, and adjusting them is an important issue: different problems have different best parameters, and adjusting them manually takes a lot of time. This thesis therefore uses the bat algorithm to adjust the parameters of the decision tree, improving the tree's original accuracy and finding the best combination of parameters. The combined algorithm finds the best parameter combination for different datasets and generates the results. Compared with the plain decision tree and with the support vector machine, the proposed algorithm improves the accuracy of the original decision tree and achieves better results than the support vector machine. The bat algorithm can thus find the most appropriate decision tree parameters for different problems.
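The tuning loop described above can be sketched as a heavily simplified bat-style search (random pulse frequencies pulling candidates toward the best-so-far), with the cross-validated accuracy of a scikit-learn tree as the fitness. The thesis's actual bat algorithm, parameter ranges, and datasets are not reproduced here:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(pos):
    # a candidate position encodes (max_depth, min_samples_leaf)
    depth, leaf = int(round(pos[0])), int(round(pos[1]))
    clf = DecisionTreeClassifier(max_depth=depth, min_samples_leaf=leaf,
                                 random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()

lo, hi = np.array([1.0, 1.0]), np.array([10.0, 10.0])
pos = rng.uniform(lo, hi, size=(6, 2))        # six "bats" in parameter space
vel = np.zeros_like(pos)
best = pos[np.argmax([fitness(p) for p in pos])].copy()
best_fit = fitness(best)

for _ in range(10):
    freq = rng.uniform(0.0, 1.0, size=(6, 1))  # random pulse frequency per bat
    vel += freq * (best - pos)                 # pull every bat toward the best
    pos = np.clip(pos + vel, lo, hi)
    for p in pos:
        f = fitness(p)
        if f > best_fit:
            best_fit, best = f, p.copy()

best_depth, best_leaf = int(round(best[0])), int(round(best[1]))
```

A full bat algorithm also varies loudness and pulse emission rate; they are omitted here to keep the sketch short.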
Hsu, Wen-Pao and 許文寶. "A Study on Market Segmentation of Information Product Marketing Channels Using the CART Decision Tree". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/63308488403561060992.
Tamkang University
Executive Master's Program, Department of Business Administration
96
As Taiwan enters the era of being an information society, marketing channels for information products on this densely populated island are gradually changing. In recent years, conventional marketing channels for information products — which mainly include localized distribution channels and centralized shopping districts — have been replaced by 3C (computers, communication products and consumer electronics) chain stores. However, customer segments in the market of information products are greatly diversified. Products and services generally required by medium-size businesses and group corporations, for example, are not available in these 3C chain stores. In other words, information products for businesses are sold through particular channels, which in Taiwan are distributed in metropolitan areas. Therefore, brand companies of information products not only have to exercise the power of their brand value, but should also pay attention to the management of channel value to survive heated competition from peers and relevant industries. This thesis has the following three objectives: (1) to integrate the distributor databases of brand companies and, using the data mining technique and statistical analysis tools, group the distributors according to their values; (2) to mine the procurement characteristics of local distributors, develop potential distributors and confirm their values according to the theory of market segmentation and the results of decision tree analysis; and (3) to study how brand companies can achieve resource centralization, maximization of efficiency, and discuss differentiated marketing strategies and the enhancement of competitiveness in channel management for distributors of different values according to the localized characteristics. In this study, the data mining technique is applied to case companies for validation. 
Customers are divided into groups by first determining the RFM (Recency, Frequency, Monetary value) model attributes and then performing a K-means cluster analysis, so that the grouping results match the localized characteristics and the RFM classification variables. The attributes and characteristics of target distributors are then identified according to the classification results obtained by using the CART (Classification and Regression Tree) technique. After analyzing data obtained from the industry and the validation results, the differences among customer types are analyzed under two themes: “analysis of customer value” and “analysis of customer classification characteristics.” Finally, appropriate marketing strategies are proposed for the respective case companies with reference to the characteristics of different customer types.
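The grouping step this abstract describes, RFM attributes followed by K-means clustering, can be sketched in a few lines. This is a minimal illustration, not the thesis's actual pipeline; the distributor records and the choice of k = 2 below are assumptions:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal K-means on RFM vectors (recency, frequency, monetary)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical distributor RFM records: (days since last order, order count, revenue)
data = [(5, 40, 900.0), (7, 35, 820.0), (90, 2, 50.0), (120, 1, 30.0)]
centroids, clusters = kmeans(data, k=2)
```

The resulting cluster labels would then feed a CART tree as the target, which is how the abstract describes identifying the attributes of high-value distributors.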
Tsai, Yu-Ju y 蔡育儒. "Paralleled CHAID Decision Tree Algorithm with Big-Data Capability". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/08393327566226485549.
Texto completo
淡江大學
統計學系碩士班
102
As technology advances, the era of Big Data has finally arrived. As the amount of data increases, improving computing speed becomes an important development goal. If data training and analysis time are reduced, we can make predictions or decisions much earlier than expected. Parallel computation is therefore one of the methods that can reduce analysis time. In this paper, we rewrite the CHAID decision tree algorithm for parallel computation and Big Data capability. Our simulation results show that, when the CPU has more than one core, the computation time of our improved CHAID tree is significantly reduced. When we have a huge amount of data, the difference in computation time is even more significant.
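CHAID's splitting criterion is the Pearson chi-square statistic computed on the contingency table of each candidate predictor against the class. A minimal sketch of that statistic (the table below is hypothetical):

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table,
    the criterion CHAID evaluates for every candidate predictor."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Candidate split: 2 predictor categories x 2 class labels (hypothetical counts)
table = [[30, 10], [10, 30]]
stat = chi_square(table)  # higher statistic = stronger association
```

Since each predictor's statistic is independent of the others, this is exactly the kind of per-predictor work that a parallel implementation, such as the one this thesis describes, can distribute across CPU cores (e.g. with a `multiprocessing.Pool` in Python).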
Lanning, James Michael. "A kernelized genetic algorithm decision tree with information criteria". 2008. http://etd.utk.edu/2008/August2008Dissertations/LanningJamesMichael.pdf.
Texto completo
Tu, Tsung-kai y 涂宗楷. "Analysis of Optimized Operation of Water Chiller by Using Data Envelopment Analysis and CART Decision Tree". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/fd39g8.
Texto completo
國立臺北科技大學
能源與冷凍空調工程系碩士班
105
With the rapid development of information technology, how to effectively and correctly analyze massive data has become an important issue. Data mining is a database analysis method that extracts implicit, previously unknown, credible and useful knowledge from large amounts of data. The CART algorithm has been successfully applied in many fields such as data exploration, text classification, image recognition and bioinformatics, where the CART classification model can achieve very prominent results. Therefore, this study first analyzes the operating data recorded by the central monitoring system of the chiller plant in Hsinchu. It then uses data envelopment analysis (DEA) to discriminate the operating points of the chiller, and uses the CART algorithm to analyze the influence of specific operating parameters on chiller efficiency, in order to find relatively efficient operating points and improve the inefficient ones. This serves as an objective reference for operational optimization and confirms that the CART classification tree model has a definite effect. The results calculated by DEA show that the average technical efficiency falls between 90% and 95%, indicating that the operating conditions of the air conditioning system are relatively efficient most of the time. The classification tree generated by CART then summarizes the relatively efficient modes of operation. According to the results, without any additional cost and without affecting the original air conditioning quality, 1-3% of energy consumption can be saved, achieving the goal of energy saving and carbon reduction.
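CART grows its tree by searching, for each numeric attribute, the threshold that minimizes the weighted Gini impurity of the two resulting branches. A minimal sketch of that search (the chiller readings and the efficient/inefficient labels below are invented for illustration):

```python
def gini(labels):
    """Gini impurity, CART's default splitting criterion."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Exhaustive search for the threshold minimizing weighted Gini impurity."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # degenerate split, skip
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical records: an operating parameter vs. efficient (1) / inefficient (0)
temps = [22, 24, 26, 30, 32, 35]
eff = [1, 1, 1, 0, 0, 0]
threshold, score = best_split(temps, eff)
```

Applied recursively to each branch, this split search yields the classification tree whose leaves summarize the relatively efficient operating modes described in the abstract.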
Shiau, Fang-Jr y 蕭方智. "Developing a Hierarchical Particle Swarm based Fuzzy Decision Tree Algorithm". Thesis, 2005. http://ndltd.ncl.edu.tw/handle/38520035342800327252.
Texto completo
元智大學
工業工程與管理學系
93
The decision tree is one of the most common techniques for classification problems in data mining. Recently, fuzzy set theory has been applied to decision tree construction to improve its performance. However, how to design flexible fuzzy membership functions for each attribute, and how to reduce the total number of rules while improving classification interpretability, are two major concerns. To solve these problems, this research proposes a hierarchical particle swarm optimization to develop a fuzzy decision tree algorithm (HPS-FDT). In the proposed HPS-FDT algorithm, all particles are encoded using a hierarchical approach to improve the efficiency of the solution search. The developed HPS-FDT builds a decision tree that aims to: (1) maximize the classification accuracy, (2) minimize the number of rules and (3) minimize the number of attributes and membership functions. Through a series of benchmark data validations, the proposed HPS-FDT algorithm shows high performance on several classification problems. In addition, the proposed HPS-FDT algorithm is tested on a mutual fund dataset provided by an Internet bank to show the possibility of real-world implementation. With the results, managers can devise a better marketing strategy for specific target customers.
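At the core of any PSO-based search such as HPS-FDT is the particle update: each particle's velocity is pulled toward its personal best and the swarm's global best. The hierarchical encoding and fuzzy-tree fitness are specific to the thesis; the sketch below uses a 1-D quadratic objective and standard coefficients purely as illustrative assumptions:

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5, seed=0):
    """One PSO update: inertia plus pulls toward personal and global bests."""
    rng = random.Random(seed)
    new_pos, new_vel = [], []
    for x, v, p in zip(positions, velocities, pbest):
        r1, r2 = rng.random(), rng.random()
        v_new = w * v + c1 * r1 * (p - x) + c2 * r2 * (gbest - x)
        new_vel.append(v_new)
        new_pos.append(x + v_new)
    return new_pos, new_vel

# Toy objective standing in for a fuzzy-tree fitness: minimize f(x) = (x - 3)^2
f = lambda x: (x - 3) ** 2
pos, vel = [0.0, 10.0, -5.0], [0.0, 0.0, 0.0]
pbest = pos[:]
gbest = min(pos, key=f)
for step in range(200):
    pos, vel = pso_step(pos, vel, pbest, gbest, seed=step)
    for i, x in enumerate(pos):
        if f(x) < f(pbest[i]):
            pbest[i] = x
    gbest = min(pbest, key=f)
```

In HPS-FDT the position vector would instead encode tree structure and membership-function parameters, and the objective would combine accuracy with rule- and attribute-count penalties.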
Chang, Tien-Wei y 張添瑋. "Using Expert Decision Tree Algorithm in Computer Assisted Testing System". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/5qd4ng.
Texto completo
國立嘉義大學
資訊工程學系研究所
97
Since each person has a learning method that suits him or her, adaptive learning is hard to achieve through the existing single-feedback assessment mechanism. This thesis integrates an expert decision tree algorithm to build up the relationships between learning types and teaching strategies. These rules are embedded into a computer-assisted system to set up a new, diversified teaching assessment system. Experts' teaching experience and the KLSI inventory are used to set up the rules of the learning-type knowledge database in the system. Based on this database, the system judges learners' strengths in each stage of the learning period and divides the learners into four learning types. Then, based on the responses to the teaching material and the assessment results, the learning strategy suitable for each type of learner is analyzed. After revising the knowledge rule correlations, the system proposes a suggested learning strategy for the learners to enhance learning effectiveness.
Guo, Jin-Jhong y 郭金忠. "Medical and Biological Datasets Using Decision Tree and Genetic Algorithm". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/16671173991554268590.
Texto completo
國立雲林科技大學
工業工程與管理研究所碩士班
94
This study describes the application of a decision tree and a genetic algorithm (GA) to evolve a useful subset of discriminatory features for medical and biological data. We show that a GA combined with a decision tree performs well in extracting information about the relationships between biological variables. The results suggest that a six-feature set: Aspartate aminotransferase (GOT), HBV Surface Ag (HBs-Ag), Blood urea nitrogen (BUN), Globulin (Glo), White blood cell count (WBC) and Electrolyte Ca (Ca), is appropriate for our large dataset. This feature set represents the relationships between biological variables and provides useful information for clinical diagnosis.
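The GA-plus-decision-tree wrapper described here encodes each candidate feature subset as a bit string and evolves it toward higher classifier accuracy. A minimal sketch of that loop; in the thesis the fitness would be a decision tree's accuracy on the medical data, while the surrogate fitness and the "useful" target subset below are invented for illustration:

```python
import random

def ga_feature_select(n_feat, fitness, pop_size=20, gens=40, seed=1):
    """Bit-string GA: each chromosome marks a subset of included features."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]            # elitist truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)            # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_feat)] ^= 1         # single-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy surrogate fitness: reward agreement with a known "useful" subset
useful = [1, 0, 1, 0, 0, 1]
fitness = lambda chrom: sum(1 for c, u in zip(chrom, useful) if c == u)
best = ga_feature_select(6, fitness)
```

Swapping the toy fitness for "train a decision tree on the selected columns and return its cross-validated accuracy" gives the wrapper scheme the abstract describes.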
Lin, Yi-Kun y 林義坤. "Apply Decision Tree Algorithm to the Analysis of Web Log". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/17894208644016211291.
Texto completo
華梵大學
資訊管理學系碩士班
99
With the rapid development of the Internet and the increase in network services, the network has imperceptibly become an important tool in our lives, which makes network security incidents endless. In many of these incidents, websites are attacked by hackers because web application programmers lack awareness of network security, so vulnerabilities such as Cross-Site Request Forgery (CSRF), SQL Injection, Script Insertion, and Cross-Site Scripting (XSS) exist in websites. Attacked websites can suffer serious damage through leakage of personal information or confidential business information. From this viewpoint, it is important to find an effective security defense strategy against attacks. This research simulates a hacker-attack environment on a virtual machine and implements the attack methods. A dataset is established from the web logs collected during the attacks, and a decision tree algorithm is then used to analyze and determine whether a web page is subject to malicious attacks. Seven rules are obtained from the experimental results. They can be used to analyze web logs and provide decision support for the network administrator. Keywords: SQL Injection, Cross Site Scripting, Web Log, Decision Tree Algorithm
Yu, Li y 游力. "Dynamic Ensemble Decision Tree Learning Algorithm for Network Traffic Classification". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/cyqff7.
Texto completo
國立交通大學
網路工程研究所
104
Network traffic classification has been discussed for decades; it gives us the ability to monitor and detect the applications associated with network traffic. It has become an essential step of network management and traffic engineering tasks such as QoS control, anomaly detection and ISP network planning, evolving from the earliest approach, port-based classification, to the state-of-the-art practice, machine learning classification. Besides, most information technology research and advisory organizations have forecast that we are entering the era of big data, facing high-volume, high-velocity and high-variety data. Machine learning approaches to traffic classification deliver satisfying accuracy with lower computing resources, which meets the high-volume and high-velocity requirements of big data. However, most machine-learning-based traffic classification research assumes that the network environment is stable, which is not true. This assumption makes the classifiers unable to deal with highly varied data, since they have no countermeasure for changes in the network environment. In order to address this issue, we propose the dynamic ensemble decision tree learning algorithm, or EDT. Our EDT is able to dynamically update its prediction model without retraining the whole model all over again. In the experiment, the testing data are collected in our experimental LTE network. Evaluation shows our algorithm can respond to new applications 24 times faster on average than the original C5.0 decision tree learning algorithm while losing no more than 1.02% accuracy. The contribution of this thesis is a new decision tree model with the ability to dynamically adjust itself.
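The update-without-full-retraining idea can be sketched as a weighted voting ensemble to which new members are added when new applications appear. This is only an illustration of the general scheme, not EDT's actual update rule; the stump "trees", weights and the `pkt_size` feature are hypothetical:

```python
class DynamicEnsemble:
    """Incrementally updatable voting ensemble: a new member can be added
    for a newly observed traffic class without retraining existing members."""

    def __init__(self):
        self.members = []  # list of (classifier, weight)

    def add(self, clf, weight=1.0):
        self.members.append((clf, weight))

    def predict(self, x):
        votes = {}
        for clf, w in self.members:
            label = clf(x)
            votes[label] = votes.get(label, 0.0) + w
        return max(votes, key=votes.get)

ens = DynamicEnsemble()
# Hypothetical decision stumps keyed on a single flow feature
ens.add(lambda x: "web" if x["pkt_size"] < 600 else "video", weight=2.0)
ens.add(lambda x: "web" if x["pkt_size"] < 300 else "p2p", weight=1.0)
pred = ens.predict({"pkt_size": 200})
```

In a real deployment each member would be a trained decision tree over flow features, and member weights would be adjusted as classes appear or drift.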
LIN, SU CHING y 蘇清霖. "Building a salary predictive analysis model using the decision tree algorithm". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/23503546290538682457.
Texto completo
僑光科技大學
資訊科技研究所
101
With the evolution of technology and the vigorous development of the Internet, people no longer simply record information. Instead, they analyze such data to discover its potential uses for knowledge, or analyze existing facts to produce forecasts, which then contribute to society. In recent years, domestic job recruitment websites have exploited the advantages of the Internet for job seekers, providing convenience, interactivity and rich information, in sharp contrast to traditional print media. Therefore, how to properly utilize the huge amount of data from such websites is a significantly important subject. This paper uses the curriculum vitae database of 2010 offered by the 1111 Job Bank; it contains two million applicants' records, an amount that is considerable and representative. Based on the education and work experience data provided by applicants on their curricula vitae, and using the decision tree algorithm from data mining technology, a salary predictive analysis model is constructed. Using the model, applicants can predict their salary range based on their qualifications, work experience and the conditions in recruitment advertisements. This information can then be used by applicants, when applying for jobs, as a reference for choosing jobs and negotiating salary in the interview.
Wang, Yuang-Jang y 王元璋. "A fuzzy decision tree with fuzzy entropy turned by genetic algorithm". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/25559225344644267556.
Texto completo
國立成功大學
資訊管理研究所
94
The decision tree is one of the data mining methodologies. It generates rules via learning to provide back-end information to decision support systems (DSS) or executive information systems (EIS) for decision making. Regular decision trees lack the ability to deal with uncertainty. Fuzzy decision trees handle uncertainty using linguistic variables with adjustable membership functions. The membership functions of a fuzzy decision tree can adapt to various situations to gain decision accuracy. This study builds two kinds of classification models. The first is based on genetic algorithms: it uses a real-number genetic algorithm and a fitness function that is easy to evaluate. The second uses a fuzzy decision tree model based on Janikow's fuzzy entropy and Quinlan's entropy; this model searches for the best membership functions of the fuzzy decision tree using a genetic algorithm. In the study, three or two linguistic terms are used to form the fuzzy decision tree, depending on the problem encountered. The UCI-ML repository is used as the research database. The study shows that both the genetic algorithm and the fuzzy decision tree achieve better accuracy rates than Naïve Bayes and the C4.5-based decision tree. The advantage of the proposed fuzzy decision tree is that the decision boundary can be adjusted to improve accuracy using the domain knowledge of managers.
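The adjustable membership functions mentioned here are typically triangular, with knot positions as the parameters a GA would tune. A minimal sketch (the three linguistic terms and their knots over a normalized attribute are illustrative assumptions, not the thesis's tuned values):

```python
def triangular(x, a, b, c):
    """Triangular membership function; the knots (a, b, c) are the
    parameters a genetic algorithm would tune."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Three linguistic terms over a normalized attribute (hypothetical knots;
# shoulder terms use degenerate a == b or b == c)
terms = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
degrees = {name: triangular(0.4, *knots) for name, knots in terms.items()}
```

A crisp attribute value thus belongs to several terms at once with different degrees, which is what lets the fuzzy tree's decision boundary shift smoothly as the knots are adjusted.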
Chen, Ming-Shiang y 陳明祥. "Predicting Student Deviation Behavior By Hybrid Genetic Algorithm/Decision Tree Approach". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/75138559893452241385.
Texto completo
華梵大學
資訊管理學系碩士班
94
Abstract: Because of changing times, a pluralistic society, and pressure from school, family and individual factors, students' deviant behaviors occur more and more often. Once a student's deviant behavior happens, its great impact on the individual or the group may be difficult to compensate. This research uses information technology, namely a hybrid genetic algorithm (GA) and decision tree (DT) approach, to predict whether a student has a tendency toward deviance. The experimental results demonstrate that, among 116 attributes, gender, arriving late to school, and conflict with the father are among the criteria that distinguish whether a student has a tendency toward deviant behavior. Although the questionnaire items (attributes) only partially cover the influencing factors, the average accuracy of this research in predicting whether students have a tendency toward deviance still reaches 86%. This research therefore provides a new idea for educational counseling personnel: student deviation tendencies can be forecast not only by traditional statistical approaches; the hybrid GA/DT-based knowledge learning mechanism can likewise provide richer, clearer and finer rules to forecast what kind of deviant behavior a student may show, for use by the relevant educational personnel. Besides, if the latent influence factors in the rules can be skillfully improved, while counseling and guidance methods are used in coordination to prevent deviance from happening, it should be possible to effectively reduce the rate of student deviant behavior and help limited tutoring manpower achieve good results even when confronting thousands of students. Keywords: Deviant behavior, Genetic algorithm, Decision trees
LAI, I.-CHIEN y 賴以建. "GENETIC ALGORITHM, NEURAL NETWORK AND DECISION TREE IN PRE-WARNING MODELS". Thesis, 2003. http://ndltd.ncl.edu.tw/handle/34939994824622402263.
Texto completo
國立臺北大學
企業管理學系
91
In the past, pre-warning models for financial crises were usually established with traditional statistical methods such as discriminant analysis. However, it is often questionable whether financial data satisfy the assumptions of such models. Therefore, this study investigates the construction of pre-warning models through nonlinear methods such as the Genetic Algorithm and the Neural Network. In addition, since the reference value for the key indicator that most influences business failure cannot be extracted from the pre-warning model, this study uses the Decision Tree technique to extract this reference value. Based upon this, the objectives of this thesis include the following: 1. Identify the chromosome that influences business failure most through the Genetic Algorithm's strong searching capability. 2. Construct financial pre-warning models with Neural Network and traditional Discriminant Analysis techniques, and evaluate their pre-warning performance by comparing their ability to predict business failure three years before its occurrence. 3. Extract the reference value and the key descriptive indicator that influences business failure most through the Decision Tree technique, thus enabling the investing public and the associated authorities to constantly monitor the key financial factors. The main characteristic of the Genetic Algorithm used in this study is its massively parallel optimizing ability. The analyses of the actual data show that: 1. Identifying the genetic component (chromosome) that influences business failure most through the Genetic Algorithm: after 500 generations, the optimal chromosome combination is Operating Income Ratio, Sales per Share, Earnings before Interest/Equity, Net Present Value per Stock (A), Net Present Value per Stock (B), Retained Profit Ratio, Cash Flow Adequacy Ratio, Times Interest Earned, Fixed Asset Turnover Ratio, and Operating Expense Ratio.
2. By employing the key indicators obtained from the Genetic Algorithm, both the Neural Network model and the Discriminant Analysis model can accurately predict business failure (on average, for three-year-ahead prediction, hit ratio: 0.9500 compared with 0.9055). The hit ratios for both models are the same (0.9667) for one-year-ahead prediction. However, the hit ratios for two- and three-year-ahead predictions are higher for the Neural Network model (0.9500 and 0.9333 compared with 0.9166 and 0.8333). This indicates that the Neural Network pre-warning model has a higher probability of successfully predicting business failure earlier. 3. The Decision Tree cannot effectively distinguish the samples of successful and failed businesses. The following results are observed: when the Retained Profit Ratio of a business is larger than 0.9931, the business failure rate is about 88%; when the Retained Profit Ratio is larger than 0.9931 and the Operating Income Ratio is lower than 0.0098, the business failure rate is as high as 97.33%.
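Decision-tree output of this kind reads directly as nested conditions. A sketch transcribing the two thresholds reported above; the branch for ratios at or below 0.9931 is not reported in the abstract, so it is left as `None` rather than invented:

```python
def failure_risk(retained_profit_ratio, operating_income_ratio):
    """Decision-tree rules reported in the thesis, transcribed as
    nested conditions (observed failure rates, not probabilities)."""
    if retained_profit_ratio > 0.9931:
        if operating_income_ratio < 0.0098:
            return 0.9733  # ~97.33% observed failure rate
        return 0.88        # ~88% observed failure rate
    return None            # no rule reported for this branch

risk = failure_risk(1.2, 0.005)
```

This transparency, a threshold and an observed rate per leaf, is exactly why the study turns to a Decision Tree for reference values that the Neural Network model cannot provide.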
Sheif, Kuo-yeith y 謝國義. "An improved algorithm to reduce the computational complexity in decision tree generation". Thesis, 1998. http://ndltd.ncl.edu.tw/handle/33259286892116275891.
Texto completo
Fann, Wen-Chih y 范文誌. "Predicting mortality in patients with necrotizing fasciitis by using decision tree algorithm". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/51141977925238132643.
Texto completo
臺北醫學大學
醫學資訊研究所
97
Objective: To identify simple admission clinical characteristics or laboratory tests that not only predict mortality but also differentiate between high- and low-mortality-risk groups in patients with necrotizing fasciitis. Methods: This retrospective chart-review study included adult patients who were admitted to two hospitals through the emergency department with a discharge diagnosis of necrotizing fasciitis. Both the chi-square test (or the Mann-Whitney U test) and the C4.5 decision tree were used to analyze 23 variables among clinical characteristics and laboratory tests. The main outcome measure was in-hospital mortality. Results: 272 patients were included and the overall mortality rate was 17%. On univariate analysis, significant variables associated with mortality included liver cirrhosis, cancer, chronic kidney disease, adrenal insufficiency, hypotension, white blood cell (WBC) count, WBC band form, and hemoglobin. Three independent predictors of mortality (WBC count, WBC band form, and hypotension) were determined by means of the C4.5 decision tree. From these predictors, six decision rules were produced to classify patients with necrotizing fasciitis into high and low mortality risk groups. The accuracy of the C4.5 decision tree with cross-validation was 84.2% (95% confidence interval, 80.3%-88.1%). Conclusions: By using routine blood pressure measurement and a simple laboratory test, the WBC count and differential, emergency physicians may rapidly identify patients with high mortality.
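C4.5 chooses predictors such as hypotension by entropy-based information gain (its gain ratio further divides by the split's own entropy). A minimal sketch of the gain computation; the patient counts below are invented, not the study's data:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def info_gain(groups):
    """Information gain of a split, given the label lists of its branches."""
    parent = [y for g in groups for y in g]
    n = len(parent)
    return entropy(parent) - sum(len(g) / n * entropy(g) for g in groups)

# Hypothetical split of patients on hypotension: labels are died (1) / survived (0)
gain = info_gain([[1, 1, 1, 0], [0, 0, 0, 1]])
```

The attribute with the highest gain (ratio) becomes the node's test, which is how the three predictors and the six decision rules in the abstract would emerge from the 23 candidate variables.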
TSOU, YU-CHIEH y 鄒侑捷. "Application of Decision Tree Algorithm to Item Selection Strategies in Multistage Testing". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/e97vsd.
Texto completo
輔仁大學
統計資訊學系應用統計碩士班
107
The purpose of this study is to combine the decision tree model with test theory and apply it to multistage testing. The results indicate that the classification accuracy and root mean square error of the decision tree selection, Fisher information and KL information methods are better than those of random selection. Under the one-parameter model assumption, the costs of the decision tree, Fisher information and KL information methods are similar when student ability follows a uniform distribution. However, when student ability follows a normal distribution, the cost of the decision tree item selection strategy is greater than that of the other methods. Under the two-parameter model assumption, the cost of the decision tree method can be reduced, making the test more economical.
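The Fisher information baseline mentioned above selects, at each stage, the item most informative at the current ability estimate. Under the two-parameter logistic (2PL) model an item's information is I(theta) = a^2 P(1 - P). A minimal sketch (the item bank and ability value are hypothetical):

```python
from math import exp

def p_2pl(theta, a, b):
    """2PL item response probability with discrimination a and difficulty b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Item Fisher information I(theta) = a^2 * P * (1 - P) under the 2PL model."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item bank: (discrimination a, difficulty b)
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
theta = 0.2  # current ability estimate
best_item = max(bank, key=lambda ab: fisher_info(theta, *ab))
```

The decision tree strategy the thesis studies replaces this per-examinee maximization with a pre-built routing tree over item responses, which is where its cost advantage under the two-parameter model comes from.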