Theses on the topic "Decision Tree with CART algorithm"
Consult the top 50 theses for your research on the topic "Decision Tree with CART algorithm".
Explore theses on a wide variety of disciplines and organize your bibliography correctly.
Hari, Vijaya. "Empirical Investigation of CART and Decision Tree Extraction from Neural Networks". Ohio University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1235676338.
Konda, Ramesh. "Predicting Machining Rate in Non-Traditional Machining using Decision Tree Inductive Learning". NSUWorks, 2010. http://nsuworks.nova.edu/gscis_etd/199.
Fernandes, Fabiano Rodrigues. "Emprego de diferentes algoritmos de árvores de decisão na classificação da atividade celular in vitro para tratamentos de superfícies de titânio". Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/165456.
Interest in the analysis and characterization of biomedical materials grows as the need to select the adequate material increases. Depending on the conditions to which materials are submitted, characterization may involve the evaluation of mechanical, electrical, optical, chemical and thermal properties, besides bioactivity and immunogenicity. The literature reports the application of decision trees, using the SimpleCart (CART) and J48 algorithms, to classify a dataset generated from the results of scientific articles. The objective of this study was therefore to identify the surface characteristics that optimize cellular activity. Based on published articles, the effect of the surface treatment of titanium on in vitro cells (MC3T3-E1 cells) was evaluated, and applying the SimpleCart algorithm was found to give better results than J48. In this sense, the present study applies the CHAID (Chi-square Automatic Interaction Detection) algorithm and Exhaustive CHAID to the surveyed data and compares the results with those of the SimpleCart algorithm. Validation showed that Exhaustive CHAID obtained better results than CHAID, with 75.9% accurate estimation against 58.5%, and a standard error of 7.9% against 9.1%, respectively. The SimpleCart (CART) results already reported in the literature reached 34.5% accurate estimation with a standard error of 8.8%. Regarding execution time over the 22,000 records, Exhaustive CHAID presented the best times, with a gain of 0.02 seconds over CHAID and 14.45 seconds over SimpleCart (CART).
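The abstract above contrasts CART's impurity-based splits with CHAID's chi-square tests. A minimal numpy sketch of the two node-scoring criteria (function names and the tiny arrays are illustrative, not data from the thesis):

```python
import numpy as np

def gini_impurity(y):
    # CART node score: 1 - sum_k p_k^2 (0 means a pure node)
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def chi2_statistic(x, y):
    # CHAID-style score: chi-square of the predictor/class contingency table
    xs, ys = np.unique(x), np.unique(y)
    obs = np.array([[np.sum((x == a) & (y == b)) for b in ys] for a in xs],
                   dtype=float)
    exp = obs.sum(axis=1, keepdims=True) * obs.sum(axis=0, keepdims=True) / obs.sum()
    return float(np.sum((obs - exp) ** 2 / exp))

y = np.array([0, 0, 1, 1])
x_good = np.array([0, 0, 1, 1])   # perfectly associated with the class
x_bad = np.array([0, 1, 0, 1])    # independent of the class
```

A CHAID-style splitter would prefer `x_good` (larger chi-square), just as CART would prefer the split that drives Gini impurity to zero.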
Kassim, M. E. "Elliptical cost-sensitive decision tree algorithm (ECSDT)". Thesis, University of Salford, 2018. http://usir.salford.ac.uk/47191/.
Shi, Haijian. "Best-first Decision Tree Learning". The University of Waikato, 2007. http://hdl.handle.net/10289/2317.
Texto completoGirardini, Davide <1985>. "Efficient implementation of Treant: a robust decision tree learning algorithm". Master's Degree Thesis, Università Ca' Foscari Venezia, 2020. http://hdl.handle.net/10579/17423.
Texto completoTrivedi, Ankit P. "Decision tree-based machine learning algorithm for in-node vehicle classification". Thesis, California State University, Long Beach, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10196455.
Texto completoThis paper proposes an in-node microprocessor-based vehicle classification approach to analyze and determine the types of vehicles passing over a 3-axis magnetometer sensor. The approach for vehicle classification utilizes J48 classification algorithm implemented in Weka (a machine learning software suite). J48 is Quinlan's C4.5 algorithm, an extension of decision tree machine learning based on an ID3 algorithm. The decision tree model is generated from a set of features extracted from vehicles passing over the 3-axis sensor. The features are attributes provided with correct classifications to the J48 training algorithm to generate a decision tree model with varying degrees of classification rates based on cross-validation. Ideally, using fewer attributes to generate the model allows for the highest computational efficiency due to fewer features needed to be calculated while minimalizing the tree with fewer branches. The generated tree model can then be easily implemented using nested if-loops in any language on a multitude of microprocessors. Also, setting an adaptive baseline to negate the effects of the background magnetic field allows reuse of the same tree model in multiple environments. The result of the experiment shows that the vehicle classification system is effective and efficient.
Krook, Jonatan. "Predicting low airfares with time series features and a decision tree algorithm". Thesis, Uppsala universitet, Statistiska institutionen, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-353274.
Texto completoJeenanunta, Chawalit. "The Approach-dependent, Time-dependent, Label-constrained Shortest Path Problem and Enhancements for the CART Algorithm with Application to Transportation Systems". Diss., Virginia Tech, 2004. http://hdl.handle.net/10919/27773.
Texto completoPh. D.
Feychting, Sara. "Incredible tweets : Automated credibility analysis in Twitter feeds using an alternating decision tree algorithm". Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186711.
Texto completoDoubleday, Kevin. "Generation of Individualized Treatment Decision Tree Algorithm with Application to Randomized Control Trials and Electronic Medical Record Data". Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/613559.
Texto completoVANCE, DANNY W. "AN ALL-ATTRIBUTES APPROACH TO SUPERVISED LEARNING". University of Cincinnati / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1162335608.
Texto completoSantos, Ernani Possato dos. "Análise de crédito com segmentação da carteira, modelos de análise discriminante, regressão logística e classification and regression trees (CART)". Universidade Presbiteriana Mackenzie, 2015. http://tede.mackenzie.br/jspui/handle/tede/970.
Texto completoThe credit claims to be one of the most important tools to trigger and move the economic wheel. Once it is well used it will bring benefits on a large scale to society; although if it is used without any balance it might bring loss to the banks, companies, to governments and also to the population. In relation to this context it becomes fundamental to evaluate models of credit capable of anticipating processses of default with an adequate degree of accuracy so as to avoid or at least to reduce the risk of credit. This study also aims to evaluate three credit risk models, being two parametric models, discriminating analysis and logistic regression, and one non-parametric, decision tree, aiming to check the accuracy of them, before and after the segmentation of such sample through the criteria of costumer s size. This research relates to an applied study about Industry BASE.
Gerdes, Mike. "Predictive Health Monitoring for Aircraft Systems using Decision Trees". Licentiate thesis, Linköpings universitet, Fluida och mekatroniska system, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-105843.
Texto completoJohansson, Viktor. "A sensor orientation and signal preprocessing study of a personal fall detection algorithm". Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-21375.
Texto completoVinnemeier, Christof David [Verfasser], Jürgen [Akademischer Betreuer] May, Uwe [Akademischer Betreuer] Groß y Tim [Akademischer Betreuer] Friede. "Establishment of a clinical algorithm for the diagnosis of P. falciparum malaria in children from an endemic area using a Classification and Regression Tree (CART) model / Christof David Vinnemeier. Gutachter: Uwe Groß ; Tim Friede. Betreuer: Jürgen May". Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2015. http://d-nb.info/1065882017/34.
Texto completoMcNamara, Nathan Patrick. "Using Decision Trees to Predict Intent to Use Passive Occupational Exoskeletons in Manufacturing Tasks". Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1605720844135027.
Texto completoAstapenko, D. "Automated system design optimisation". Thesis, Loughborough University, 2010. https://dspace.lboro.ac.uk/2134/6863.
Texto completoKhan, Kashif. "A distributed computing architecture to enable advances in field operations and management of distributed infrastructure". Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/a-distributed-computing-architecture-to-enable-advances-in-field-operations-and-management-of-distributed-infrastructure(a9181e99-adf3-47cb-93e1-89d267219e50).html.
Texto completoRodríguez, Elen Yanina Aguirre. "Técnicas de aprendizado de máquina para predição do custo da logística de transporte : uma aplicação em empresa do segmento de autopeças /". Guaratinguetá, 2020. http://hdl.handle.net/11449/192326.
Texto completoResumo: Em diferentes aspectos da vida cotidiana, o ser humano é forçado a escolher entre várias opções, esse processo é conhecido como tomada de decisão. No nível do negócio, a tomada de decisões desempenha um papel muito importante, porque dessas decisões depende o sucesso ou o fracasso das organizações. No entanto, em muitos casos, tomar decisões erradas pode gerar grandes custos. Desta forma, alguns dos problemas de tomada de decisão que um gerente enfrenta comumente são, por exemplo, a decisão para determinar um preço, a decisão de comprar ou fabricar, em problemas de logística, problemas de armazenamento, etc. Por outro lado, a coleta de dados tornou-se uma vantagem competitiva, pois pode ser utilizada para análise e extração de resultados significativos por meio da aplicação de diversas técnicas, como estatística, simulação, matemática, econometria e técnicas atuais, como aprendizagem de máquina para a criação de modelos preditivos. Além disso, há evidências na literatura de que a criação de modelos com técnicas de aprendizagem de máquina têm um impacto positivo na indústria e em diferentes áreas de pesquisa. Nesse contexto, o presente trabalho propõe o desenvolvimento de um modelo preditivo para tomada de decisão, usando as técnicas supervisionadas de aprendizado de máquina, e combinando o modelo gerado com as restrições pertencentes ao processo de otimização. O objetivo da proposta é treinar um modelo matemático com dados históricos de um processo decisório e obter os predit... (Resumo completo, clicar acesso eletrônico abaixo)
Master's
Odeh, Khaled. "Nouveaux algorithmes pour le traitement probabiliste et logique des arbres de défaillance". Compiègne, 1995. http://www.theses.fr/1995COMPD846.
Texto completoPazúriková, Jana. "Adaptivní model pro simulaci znečištění ovzduší". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236487.
Texto completoCazzolato, Mirela Teixeira. "Classificação de data streams utilizando árvore de decisão estatística e a teoria dos fractais na análise evolutiva dos dados". Universidade Federal de São Carlos, 2014. https://repositorio.ufscar.br/handle/ufscar/565.
Texto completoFinanciadora de Estudos e Projetos
A data stream is generated quickly, continuously, in order, and in large quantities. To process data streams one must consider, among other factors, the limited use of memory, the need for real-time processing, the accuracy of the results, and concept drift (which occurs when the concept of the data being analyzed changes). A decision tree is a popular representation of a classifier: it is intuitive and fast to build, and generally obtains high accuracy. The incremental decision tree techniques in the literature generally have high computational costs to construct and update the model, especially regarding the calculation used to split decision nodes. Existing methods are conservative when dealing with limited amounts of data, tending to improve their results as the number of examples increases. Another problem is that many real-world applications generate noisy data, to which existing techniques have low tolerance. This work aims to develop decision tree methods for data streams that address these deficiencies in the current state of the art. A further objective is to develop a technique to detect concept drift using fractal theory; this functionality should indicate when the model needs to be corrected, allowing an adequate description of the most recent events. To achieve these objectives, three decision tree algorithms were developed: StARMiner Tree, Automatic StARMiner Tree, and Information Gain StARMiner Tree. These algorithms use a fast statistical method as the split heuristic, one that does not depend on the number of examples. In the experiments the algorithms achieved high accuracy, also showing tolerant behavior when classifying noisy data. Finally, a drift detection method based on fractal theory was proposed to detect changes in the data distribution.
The method, called the Fractal Drift Detection Method, detects significant changes in the data distribution, causing the model to be updated whenever it no longer describes the data (i.e., becomes obsolete). The method achieved good results in the classification of data containing concept drift, proving suitable for evolutionary analysis of data.
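The fractal-based detector itself is specific to this thesis; as a generic illustration of the same idea (update the model when recent data stop matching it), here is a simple sliding-window error-rate monitor. The window size and threshold are arbitrary choices, not the authors' values:

```python
import numpy as np

def detect_drift(errors, window=30, threshold=0.3):
    # flag the first index where the recent error rate exceeds the
    # initial (reference) error rate by more than `threshold`
    errors = np.asarray(errors, dtype=float)
    ref = errors[:window].mean()
    for i in range(window, len(errors) - window + 1):
        if errors[i:i + window].mean() - ref > threshold:
            return i          # model looks obsolete around here: retrain
    return -1                 # no drift detected

# a classifier that is perfect, then suddenly always wrong (concept change)
stream = [0] * 100 + [1] * 100
```

On this stream the monitor fires shortly after the change point at index 100, once enough errors have entered the window.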
Bridgstock, Ruth Sarah. "Success in the protean career : a predictive study of professional artists and tertiary arts graduates". Thesis, Queensland University of Technology, 2007. https://eprints.qut.edu.au/16575/1/Ruth_Bridgstock_Thesis.pdf.
Cagnini, Henry Emanuel Leal. "Estimation of distribution algorithms for clustering and classification". Pontifícia Universidade Católica do Rio Grande do Sul, 2017. http://tede2.pucrs.br/tede2/handle/tede/7384.
Texto completoMade available in DSpace on 2017-06-29T11:51:00Z (GMT). No. of bitstreams: 1 DIS_HENRY_EMANUEL_LEAL_CAGNINI_COMPLETO.pdf: 3650909 bytes, checksum: 55d52061a10460875dba677a9812fe9c (MD5) Previous issue date: 2017-03-20
Extracting meaningful information from data is not an easy task. Data can come in batches or through a continuous stream, and can be incomplete, duplicated, or noisy. Moreover, there are several algorithms for data mining tasks, and the no-free-lunch theorem states that there is no single best algorithm for all problems. As a final obstacle, algorithms usually require hyperparameters to be set in order to operate, which often demands a minimum knowledge of the application domain for fine-tuning. Since many traditional data mining algorithms employ a greedy local search strategy, fine-tuning is a crucial step towards achieving better predictive models. Estimation of distribution algorithms, on the other hand, perform a global search, which is often more efficient than a wide search through the set of possible parameters. Using a quality function, an estimation of distribution algorithm iteratively seeks better solutions throughout its evolutionary process. Based on the benefits that estimation of distribution algorithms may offer to clustering and decision tree induction, two data mining tasks considered NP-hard and NP-hard/NP-complete, respectively, this work aims at developing novel algorithms that obtain better results than traditional greedy algorithms and baseline evolutionary approaches.
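As a concrete illustration of the estimation-of-distribution idea (evolve a probability model instead of a population), here is a compact genetic algorithm on the OneMax toy problem. This is a textbook EDA sketch, not the clustering or tree-induction algorithms the thesis develops:

```python
import numpy as np

def cga_onemax(n_bits=20, virtual_pop=50, iters=2000, seed=0):
    # compact GA: the "population" is just a vector of per-bit probabilities
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)
    best = 0
    for _ in range(iters):
        a = rng.random(n_bits) < p          # sample two individuals
        b = rng.random(n_bits) < p
        if a.sum() < b.sum():
            a, b = b, a                     # a = winner on OneMax fitness
        best = max(best, int(a.sum()))
        # nudge the probability model toward the winner, away from the loser
        p += (a.astype(float) - b.astype(float)) / virtual_pop
        p = np.clip(p, 0.0, 1.0)
    return p, best

p, best = cga_onemax()
```

In the thesis setting, the sampled bit-strings would encode candidate partitions or trees and the quality function would replace the OneMax bit count.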
Baker, Peter John. "Applied Bayesian modelling in genetics". Thesis, Queensland University of Technology, 2001.
Covi, Patrick. "Multi-hazard analysis of steel structures subjected to fire following earthquake". Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/313383.
Juozenaite, Ineta. "Application of machine learning techniques for solving real world business problems : the case study - target marketing of insurance policies". Master's thesis, 2018. http://hdl.handle.net/10362/32410.
Texto completoThe concept of machine learning has been around for decades, but now it is becoming more and more popular not only in the business, but everywhere else as well. It is because of increased amount of data, cheaper data storage, more powerful and affordable computational processing. The complexity of business environment leads companies to use data-driven decision making to work more efficiently. The most common machine learning methods, like Logistic Regression, Decision Tree, Artificial Neural Network and Support Vector Machine, with their applications are reviewed in this work. Insurance industry has one of the most competitive business environment and as a result, the use of machine learning techniques is growing in this industry. In this work, above mentioned machine learning methods are used to build predictive model for target marketing campaign of caravan insurance policies to achieve greater profitability. Information Gain and Chi-squared metrics, Regression Stepwise, R package “Boruta”, Spearman correlation analysis, distribution graphs by target variable, as well as basic statistics of all variables are used for feature selection. To solve this real-world business problem, the best final chosen predictive model is Multilayer Perceptron with backpropagation learning algorithm with 1 hidden layer and 12 hidden neurons.
Chiu, Chun-Chieh and 邱俊傑. "CUDT: A CUDA Based Decision Tree Algorithm". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/88189185018035112843.
National Chiao Tung University
Institute of Computer Science and Engineering
99
Classification is an important issue in both machine learning and data mining, and the decision tree is one of the best-known classification models. In real cases the data dimension is high and the data size is huge, so building a decision tree over a large database is computationally expensive. The GPU is a processor specially designed for graphics, and the highly parallel nature of graphics processing shaped today's GPU architecture. GPGPU means using the GPU to solve non-graphics problems that need large amounts of computing power; thanks to the high performance and capacity/price ratio, much research uses GPUs for heavy computation. Compute Unified Device Architecture (CUDA) is the GPGPU solution provided by NVIDIA. This thesis presents a new parallel decision tree algorithm based on CUDA that parallelizes the tree-building phase. In our system, the CPU is responsible for flow control and the GPU for computation. Compared with the Weka J48 algorithm, our system is 5~6 times faster; compared with SPRINT on large data sets, our CUDT achieves about an 18-fold speedup.
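The CUDA kernels themselves are not shown in the abstract, but the building-phase work it parallelizes is essentially a per-feature split scan. A numpy sketch of that scan for one numeric feature and binary 0/1 labels, vectorized the way a GPU thread block would process it (names and data are illustrative):

```python
import numpy as np

def best_split(x, y):
    # evaluate every threshold of one feature in a single vectorized pass
    order = np.argsort(x)
    xs, ys = x[order], y[order].astype(float)
    n = len(ys)
    left_ones = np.cumsum(ys)[:-1]        # class-1 counts left of each cut
    n_left = np.arange(1, n)
    n_right = n - n_left
    right_ones = ys.sum() - left_ones

    def gini(ones, size):
        p = ones / size
        return 1.0 - p ** 2 - (1.0 - p) ** 2

    # size-weighted Gini impurity of every possible cut point
    weighted = (n_left * gini(left_ones, n_left)
                + n_right * gini(right_ones, n_right)) / n
    k = int(np.argmin(weighted))
    return (xs[k] + xs[k + 1]) / 2.0, float(weighted[k])

thr, score = best_split(np.array([1.0, 2.0, 3.0, 4.0]),
                        np.array([0, 0, 1, 1]))
```

Each feature's scan is independent, which is what makes the building phase map naturally onto one GPU kernel launch per node level.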
Lai, Jian-Cheng and 賴建丞. "Fast Quad-Tree Depth Decision Algorithm for HEVC Coding Tree Block". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/39ucm4.
National Formosa University
Institute of Computer Science and Information Engineering
102
High Efficiency Video Coding (HEVC) is a recently developed compression technique for ultra-high-definition video that provides a higher compression ratio and throughput than the previous standard, H.264/AVC; it is therefore widely used for limited-bandwidth network transmission and confined storage space. To obtain a higher compression ratio while maintaining video quality, the HEVC encoder provides variable block partitioning and mode prediction. If every block is evaluated during the mode decision process, a lot of encoding time is consumed, limiting the applicability of HEVC in real time. Hence, many fast algorithms have been proposed to eliminate block partitions or mode predictions. In natural video, neighboring blocks are highly correlated with the current block, so reference-block methods terminate or eliminate block or mode prediction early; they use the lower computation of mode reduction to obtain a good compression ratio and time savings, and are widely proposed as fast algorithms for HEVC. On the other hand, non-reference methods have been proposed that extract features from the video frames to predict the termination condition. This thesis proposes two quad-tree depth decision methods: a reference method based on depth correlation and a non-reference method based on edge strength detection. In the reference-block method, we find up to 90% correlation with the co-located coding tree block (CTB) in the previous frame; we therefore use the co-located CTB depth information to limit the depth partitioning of the current CTB. Unlike previously proposed methods, the proposed method extends the partition depth by one level. However, prediction is poor for fast-moving objects or scene changes, where the correlation between frames is lower.
To address this disadvantage, the edge strength detection method detects the structural variation of a CTB to predict the coding depth. Since this method does not reference neighboring blocks, it predicts well on highly varying video sequences, but poorly when edges are not obvious: in dark videos, for example, the edges are faint and the algorithm predicts the depth level poorly. Finally, the proposed fast methods are implemented in the HM 10.1 model to demonstrate their efficiency. The edge strength detection method obtains 23.1% time savings with a BD-bitrate increase close to 0.28% on average, and the depth-correlation method provides about 21.1% time savings with a BD-bitrate increase of 0.17% on average.
蔡智政. "Causing factor analysis of low-yield wafer using CART decision tree and data visualization". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/86362726149609751227.
Yuan Ze University
Department of Industrial Engineering and Management
90
Every day, corporations generate large amounts of production data. It is difficult to extract information from such data rapidly in an environment full of questions, variables, and competition, and production managers cannot immediately read and analyze data that is both voluminous and varied. Yield analysis is critical for IC manufacturing, since yield is directly related to production cost and competitiveness in the market. To monitor the manufacturing process and product quality, manufacturing data are automatically collected during wafer fabrication. Engineers use these data to select possible causing factors of low-yield wafers and employ statistical techniques (e.g., design of experiments) to verify the hypotheses. This approach, however, is difficult due to the large number of parameters (from hundreds to thousands) and the complicated interactions among them. This research employs a CART decision tree to help engineers select the possible causing factors of low-yield wafers. The input is the lot in-process control (LPC) data recorded by metrology machines during wafer fabrication. In addition to the Gini index, two indexes are developed for tree node splitting. Decision trees generated using these node-splitting criteria provide engineers with useful rules for analyzing the causing factors of low-yield wafers.
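The screening step described above can be sketched with scikit-learn's CART implementation on synthetic stand-in data; the parameter that drives yield is planted at index 2, and nothing here comes from real LPC records:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
lpc = rng.normal(size=(400, 6))              # six hypothetical in-line parameters
low_yield = (lpc[:, 2] > 0.5).astype(int)    # parameter 2 drives low yield

# fit a shallow CART tree, then rank parameters by importance
cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(lpc, low_yield)
suspects = np.argsort(cart.feature_importances_)[::-1]   # most suspicious first
```

The top-ranked parameters become the hypotheses that engineers then verify with designed experiments, rather than testing hundreds of parameters blindly.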
Hsieh, Cheng-Hao and 謝正豪. "An Approach with Bat Algorithm and Decision Tree". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/38211713696274285313.
Huafan University
Master's Program, Department of Information Management
102
The amount of information is rapidly increasing, and data mining is widely used. A decision tree can process data and provide rules in a tree structure, so that decision makers can rapidly obtain the information hidden in the data; it can therefore be used to solve problems from many domains. Parameters must be set before using a decision tree, because how they are set affects the results, and adjusting them is an important issue: different problems have different best parameters, and adjusting them manually takes a lot of time. This thesis therefore uses the bat algorithm to adjust the parameters of the decision tree, improving the tree's original accuracy and finding the best combination of parameters. The combined algorithm finds the best parameter combination for different datasets and generates the results. Compared with the plain decision tree and with the support vector machine, the proposed algorithm improves the accuracy of the original decision tree and achieves better results than the support vector machine. The bat algorithm can thus find the most appropriate decision tree parameters for different problems.
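The tuning loop described above can be sketched as a heavily simplified bat-style search (random pulse frequencies pulling candidates toward the best-so-far), with the cross-validated accuracy of a scikit-learn tree as the fitness. The thesis's actual bat algorithm, parameter ranges, and datasets are not reproduced here:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(pos):
    # a candidate position encodes (max_depth, min_samples_leaf)
    depth, leaf = int(round(pos[0])), int(round(pos[1]))
    clf = DecisionTreeClassifier(max_depth=depth, min_samples_leaf=leaf,
                                 random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()

lo, hi = np.array([1.0, 1.0]), np.array([10.0, 10.0])
pos = rng.uniform(lo, hi, size=(6, 2))        # six "bats" in parameter space
vel = np.zeros_like(pos)
best = pos[np.argmax([fitness(p) for p in pos])].copy()
best_fit = fitness(best)

for _ in range(10):
    freq = rng.uniform(0.0, 1.0, size=(6, 1))  # random pulse frequency per bat
    vel += freq * (best - pos)                 # pull every bat toward the best
    pos = np.clip(pos + vel, lo, hi)
    for p in pos:
        f = fitness(p)
        if f > best_fit:
            best_fit, best = f, p.copy()

best_depth, best_leaf = int(round(best[0])), int(round(best[1]))
```

A full bat algorithm also varies loudness and pulse emission rate; they are omitted here to keep the sketch short.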
Hsu, Wen-Pao and 許文寶. "A Study on Market Segmentation of Information Product Marketing Channels Using the CART Decision Tree". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/63308488403561060992.
Tamkang University
Executive Master's Program, Department of Business Administration
96
As Taiwan enters the era of being an information society, marketing channels for information products on this densely populated island are gradually changing. In recent years, conventional marketing channels for information products — which mainly include localized distribution channels and centralized shopping districts — have been replaced by 3C (computers, communication products and consumer electronics) chain stores. However, customer segments in the market of information products are greatly diversified. Products and services generally required by medium-size businesses and group corporations, for example, are not available in these 3C chain stores. In other words, information products for businesses are sold through particular channels, which in Taiwan are distributed in metropolitan areas. Therefore, brand companies of information products not only have to exercise the power of their brand value, but should also pay attention to the management of channel value to survive heated competition from peers and relevant industries. This thesis has the following three objectives: (1) to integrate the distributor databases of brand companies and, using the data mining technique and statistical analysis tools, group the distributors according to their values; (2) to mine the procurement characteristics of local distributors, develop potential distributors and confirm their values according to the theory of market segmentation and the results of decision tree analysis; and (3) to study how brand companies can achieve resource centralization, maximization of efficiency, and discuss differentiated marketing strategies and the enhancement of competitiveness in channel management for distributors of different values according to the localized characteristics. In this study, the data mining technique is applied to case companies for validation. 
Customers are divided into groups by first determining the RFM (Recency, Frequency, Monetary value) model attributes and then performing a K-means cluster analysis, so that the grouping results match the localized characteristics and the RFM classification variables. The attributes and characteristics of target distributors are then identified according to the classification results obtained by using the CART (Classification and Regression Tree) technique. After analyzing data obtained from the industry and the validation results, the differences among customer types are analyzed under two themes: “analysis of customer value” and “analysis of customer classification characteristics.” Finally, appropriate marketing strategies are proposed for the respective case companies with reference to the characteristics of different customer types.
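The grouping step this abstract describes, RFM attributes followed by K-means clustering, can be sketched in a few lines. This is a minimal illustration, not the thesis's actual pipeline; the distributor records and the choice of k = 2 below are assumptions:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal K-means on RFM vectors (recency, frequency, monetary)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical distributor RFM records: (days since last order, order count, revenue)
data = [(5, 40, 900.0), (7, 35, 820.0), (90, 2, 50.0), (120, 1, 30.0)]
centroids, clusters = kmeans(data, k=2)
```

The resulting cluster labels would then feed a CART tree as the target, which is how the abstract describes identifying the attributes of high-value distributors.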
Tsai, Yu-Ju y 蔡育儒. "Paralleled CHAID Decision Tree Algorithm with Big-Data Capability". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/08393327566226485549.
Texto completo
淡江大學
統計學系碩士班
102
As technology advances, the era of Big Data has finally arrived. As the amount of data increases, improving computing speed becomes an important development goal. If data training and analysis time are reduced, we can make predictions or decisions much earlier than expected. Parallel computation is therefore one of the methods that can reduce analysis time. In this paper, we rewrite the CHAID decision tree algorithm for parallel computation and Big Data capability. Our simulation results show that, when the CPU has more than one core, the computation time of our improved CHAID tree is significantly reduced. When we have a huge amount of data, the difference in computation time is even more significant.
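CHAID's splitting criterion is the Pearson chi-square statistic computed on the contingency table of each candidate predictor against the class. A minimal sketch of that statistic (the table below is hypothetical):

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table,
    the criterion CHAID evaluates for every candidate predictor."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Candidate split: 2 predictor categories x 2 class labels (hypothetical counts)
table = [[30, 10], [10, 30]]
stat = chi_square(table)  # higher statistic = stronger association
```

Since each predictor's statistic is independent of the others, this is exactly the kind of per-predictor work that a parallel implementation, such as the one this thesis describes, can distribute across CPU cores (e.g. with a `multiprocessing.Pool` in Python).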
Lanning, James Michael. "A kernelized genetic algorithm decision tree with information criteria". 2008. http://etd.utk.edu/2008/August2008Dissertations/LanningJamesMichael.pdf.
Texto completo
Tu, Tsung-kai y 涂宗楷. "Analysis of Optimized Operation of Water Chiller by Using Data Envelopment Analysis and CART Decision Tree". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/fd39g8.
Texto completo
國立臺北科技大學
能源與冷凍空調工程系碩士班
105
With the rapid development of information technology, how to effectively and correctly analyze massive data has become an important issue. Data mining is a database analysis method that extracts implicit, previously unknown, credible and useful knowledge from large amounts of data. The CART algorithm has been successfully applied in many fields such as data exploration, text classification, image recognition and bioinformatics, where the CART classification model can achieve very prominent results. Therefore, this study first analyzes the operating data recorded by the central monitoring system of the chiller plant in Hsinchu. It then uses data envelopment analysis (DEA) to discriminate the operating points of the chiller, and uses the CART algorithm to analyze the influence of specific operating parameters on chiller efficiency, in order to find relatively efficient operating points and improve the inefficient ones. This serves as an objective reference for operational optimization and confirms that the CART classification tree model has a definite effect. The results calculated by DEA show that the average technical efficiency falls between 90% and 95%, indicating that the operating conditions of the air conditioning system are relatively efficient most of the time. The classification tree generated by CART then summarizes the relatively efficient modes of operation. According to the results, without any additional cost and without affecting the original air conditioning quality, 1-3% of energy consumption can be saved, achieving the goal of energy saving and carbon reduction.
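CART grows its tree by searching, for each numeric attribute, the threshold that minimizes the weighted Gini impurity of the two resulting branches. A minimal sketch of that search (the chiller readings and the efficient/inefficient labels below are invented for illustration):

```python
def gini(labels):
    """Gini impurity, CART's default splitting criterion."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Exhaustive search for the threshold minimizing weighted Gini impurity."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # degenerate split, skip
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical records: an operating parameter vs. efficient (1) / inefficient (0)
temps = [22, 24, 26, 30, 32, 35]
eff = [1, 1, 1, 0, 0, 0]
threshold, score = best_split(temps, eff)
```

Applied recursively to each branch, this split search yields the classification tree whose leaves summarize the relatively efficient operating modes described in the abstract.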
Shiau, Fang-Jr y 蕭方智. "Developing a Hierarchical Particle Swarm based Fuzzy Decision Tree Algorithm". Thesis, 2005. http://ndltd.ncl.edu.tw/handle/38520035342800327252.
Texto completo
元智大學
工業工程與管理學系
93
The decision tree is one of the most common techniques for classification problems in data mining. Recently, fuzzy set theory has been applied to decision tree construction to improve its performance. However, how to design flexible fuzzy membership functions for each attribute, and how to reduce the total number of rules while improving classification interpretability, are two major concerns. To solve these problems, this research proposes a hierarchical particle swarm optimization to develop a fuzzy decision tree algorithm (HPS-FDT). In the proposed HPS-FDT algorithm, all particles are encoded using a hierarchical approach to improve the efficiency of the solution search. The developed HPS-FDT builds a decision tree that aims to: (1) maximize the classification accuracy, (2) minimize the number of rules and (3) minimize the number of attributes and membership functions. Through a series of benchmark data validations, the proposed HPS-FDT algorithm shows high performance on several classification problems. In addition, the proposed HPS-FDT algorithm is tested on a mutual fund dataset provided by an Internet bank to show the possibility of real-world implementation. With the results, managers can devise a better marketing strategy for specific target customers.
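At the core of any PSO-based search such as HPS-FDT is the particle update: each particle's velocity is pulled toward its personal best and the swarm's global best. The hierarchical encoding and fuzzy-tree fitness are specific to the thesis; the sketch below uses a 1-D quadratic objective and standard coefficients purely as illustrative assumptions:

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5, seed=0):
    """One PSO update: inertia plus pulls toward personal and global bests."""
    rng = random.Random(seed)
    new_pos, new_vel = [], []
    for x, v, p in zip(positions, velocities, pbest):
        r1, r2 = rng.random(), rng.random()
        v_new = w * v + c1 * r1 * (p - x) + c2 * r2 * (gbest - x)
        new_vel.append(v_new)
        new_pos.append(x + v_new)
    return new_pos, new_vel

# Toy objective standing in for a fuzzy-tree fitness: minimize f(x) = (x - 3)^2
f = lambda x: (x - 3) ** 2
pos, vel = [0.0, 10.0, -5.0], [0.0, 0.0, 0.0]
pbest = pos[:]
gbest = min(pos, key=f)
for step in range(200):
    pos, vel = pso_step(pos, vel, pbest, gbest, seed=step)
    for i, x in enumerate(pos):
        if f(x) < f(pbest[i]):
            pbest[i] = x
    gbest = min(pbest, key=f)
```

In HPS-FDT the position vector would instead encode tree structure and membership-function parameters, and the objective would combine accuracy with rule- and attribute-count penalties.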
Chang, Tien-Wei y 張添瑋. "Using Expert Decision Tree Algorithm in Computer Assisted Testing System". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/5qd4ng.
Texto completo
國立嘉義大學
資訊工程學系研究所
97
Since each person has a learning method that suits him or her, adaptive learning is hard to achieve through the existing single-feedback assessment mechanism. This thesis integrates an expert decision tree algorithm to build up the relationships between learning types and teaching strategies. These rules are embedded into a computer-assisted system to set up a new, diversified teaching assessment system. Experts' teaching experience and the KLSI inventory are used to set up the rules of the learning-type knowledge database in the system. Based on this database, the system judges learners' strengths in each stage of the learning period and divides the learners into four learning types. Then, based on the responses to the teaching material and the assessment results, the learning strategy suitable for each type of learner is analyzed. After revising the knowledge rule correlations, the system proposes a suggested learning strategy for the learners to enhance learning effectiveness.
Guo, Jin-Jhong y 郭金忠. "Medical and Biological Datasets Using Decision Tree and Genetic Algorithm". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/16671173991554268590.
Texto completo
國立雲林科技大學
工業工程與管理研究所碩士班
94
This study describes the application of a decision tree and a genetic algorithm (GA) to evolve a useful subset of discriminatory features for medical and biological data. We show that a GA combined with a decision tree performs well in extracting information about the relationships between biological variables. The results suggest that a six-feature set: Aspartate aminotransferase (GOT), HBV Surface Ag (HBs-Ag), Blood urea nitrogen (BUN), Globulin (Glo), White blood cell count (WBC) and Electrolyte Ca (Ca), is appropriate for our large dataset. This feature set represents the relationships between biological variables and provides useful information for clinical diagnosis.
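The GA-plus-decision-tree wrapper described here encodes each candidate feature subset as a bit string and evolves it toward higher classifier accuracy. A minimal sketch of that loop; in the thesis the fitness would be a decision tree's accuracy on the medical data, while the surrogate fitness and the "useful" target subset below are invented for illustration:

```python
import random

def ga_feature_select(n_feat, fitness, pop_size=20, gens=40, seed=1):
    """Bit-string GA: each chromosome marks a subset of included features."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]            # elitist truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)            # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_feat)] ^= 1         # single-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy surrogate fitness: reward agreement with a known "useful" subset
useful = [1, 0, 1, 0, 0, 1]
fitness = lambda chrom: sum(1 for c, u in zip(chrom, useful) if c == u)
best = ga_feature_select(6, fitness)
```

Swapping the toy fitness for "train a decision tree on the selected columns and return its cross-validated accuracy" gives the wrapper scheme the abstract describes.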
Lin, Yi-Kun y 林義坤. "Apply Decision Tree Algorithm to the Analysis of Web Log". Thesis, 2011. http://ndltd.ncl.edu.tw/handle/17894208644016211291.
Texto completo
華梵大學
資訊管理學系碩士班
99
With the rapid development of the Internet and the increase in network services, the network has imperceptibly become an important tool in our lives, which makes network security incidents endless. In many of these incidents, websites are attacked by hackers because web application programmers lack awareness of network security, so vulnerabilities such as Cross-Site Request Forgery (CSRF), SQL Injection, Script Insertion, and Cross-Site Scripting (XSS) exist in websites. Attacked websites can suffer serious damage through leakage of personal information or confidential business information. From this viewpoint, it is important to find an effective security defense strategy against attacks. This research simulates a hacker-attack environment on a virtual machine and implements the attack methods. A dataset is established from the web logs collected during the attacks, and a decision tree algorithm is then used to analyze and determine whether a web page is subject to malicious attacks. Seven rules are obtained from the experimental results. They can be used to analyze web logs and provide decision support for the network administrator. Keywords: SQL Injection, Cross Site Scripting, Web Log, Decision Tree Algorithm
Yu, Li y 游力. "Dynamic Ensemble Decision Tree Learning Algorithm for Network Traffic Classification". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/cyqff7.
Texto completo
國立交通大學
網路工程研究所
104
Network traffic classification has been discussed for decades; it gives us the ability to monitor and detect the applications associated with network traffic. It has become an essential step of network management and traffic engineering tasks such as QoS control, anomaly detection and ISP network planning, evolving from the earliest approach, port-based classification, to the state-of-the-art practice, machine learning classification. Besides, most information technology research and advisory organizations have forecast that we are entering the era of big data, facing high-volume, high-velocity and high-variety data. Machine learning approaches to traffic classification deliver satisfying accuracy with lower computing resources, which meets the high-volume and high-velocity requirements of big data. However, most machine-learning-based traffic classification research assumes that the network environment is stable, which is not true. This assumption makes the classifiers unable to deal with highly varied data, since they have no countermeasure for changes in the network environment. In order to address this issue, we propose the dynamic ensemble decision tree learning algorithm, or EDT. Our EDT is able to dynamically update its prediction model without retraining the whole model all over again. In the experiment, the testing data are collected in our experimental LTE network. Evaluation shows our algorithm can respond to new applications 24 times faster on average than the original C5.0 decision tree learning algorithm while losing no more than 1.02% accuracy. The contribution of this thesis is a new decision tree model with the ability to dynamically adjust itself.
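The update-without-full-retraining idea can be sketched as a weighted voting ensemble to which new members are added when new applications appear. This is only an illustration of the general scheme, not EDT's actual update rule; the stump "trees", weights and the `pkt_size` feature are hypothetical:

```python
class DynamicEnsemble:
    """Incrementally updatable voting ensemble: a new member can be added
    for a newly observed traffic class without retraining existing members."""

    def __init__(self):
        self.members = []  # list of (classifier, weight)

    def add(self, clf, weight=1.0):
        self.members.append((clf, weight))

    def predict(self, x):
        votes = {}
        for clf, w in self.members:
            label = clf(x)
            votes[label] = votes.get(label, 0.0) + w
        return max(votes, key=votes.get)

ens = DynamicEnsemble()
# Hypothetical decision stumps keyed on a single flow feature
ens.add(lambda x: "web" if x["pkt_size"] < 600 else "video", weight=2.0)
ens.add(lambda x: "web" if x["pkt_size"] < 300 else "p2p", weight=1.0)
pred = ens.predict({"pkt_size": 200})
```

In a real deployment each member would be a trained decision tree over flow features, and member weights would be adjusted as classes appear or drift.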
LIN, SU CHING y 蘇清霖. "Building a salary predictive analysis model using the decision tree algorithm". Thesis, 2013. http://ndltd.ncl.edu.tw/handle/23503546290538682457.
Texto completo
僑光科技大學
資訊科技研究所
101
With the evolution of technology and the vigorous development of the Internet, people no longer simply record information. Instead, they analyze such data to discover its potential uses for knowledge, or analyze existing facts to produce forecasts, which then contribute to society. In recent years, domestic job recruitment websites have exploited the advantages of the Internet for job seekers, providing convenience, interactivity and rich information, in sharp contrast to traditional print media. Therefore, how to properly utilize the huge amount of data from such websites is a significantly important subject. This paper uses the curriculum vitae database of 2010 offered by the 1111 Job Bank; it contains two million applicants' records, an amount that is considerable and representative. Based on the education and work experience data provided by applicants on their curricula vitae, and using the decision tree algorithm from data mining technology, a salary predictive analysis model is constructed. Using the model, applicants can predict their salary range based on their qualifications, work experience and the conditions in recruitment advertisements. This information can then be used by applicants, when applying for jobs, as a reference for choosing jobs and negotiating salary in the interview.
Wang, Yuang-Jang y 王元璋. "A fuzzy decision tree with fuzzy entropy turned by genetic algorithm". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/25559225344644267556.
Texto completo
國立成功大學
資訊管理研究所
94
The decision tree is one of the data mining methodologies. It generates rules via learning to provide back-end information to decision support systems (DSS) or executive information systems (EIS) for decision making. Regular decision trees lack the ability to deal with uncertainty. Fuzzy decision trees handle uncertainty using linguistic variables with adjustable membership functions. The membership functions of a fuzzy decision tree can adapt to various situations to gain decision accuracy. This study builds two kinds of classification models. The first is based on genetic algorithms: it uses a real-number genetic algorithm and a fitness function that is easy to evaluate. The second uses a fuzzy decision tree model based on Janikow's fuzzy entropy and Quinlan's entropy; this model searches for the best membership functions of the fuzzy decision tree using a genetic algorithm. In the study, three or two linguistic terms are used to form the fuzzy decision tree, depending on the problem encountered. The UCI-ML repository is used as the research database. The study shows that both the genetic algorithm and the fuzzy decision tree achieve better accuracy rates than Naïve Bayes and the C4.5-based decision tree. The advantage of the proposed fuzzy decision tree is that the decision boundary can be adjusted to improve accuracy using the domain knowledge of managers.
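The adjustable membership functions mentioned here are typically triangular, with knot positions as the parameters a GA would tune. A minimal sketch (the three linguistic terms and their knots over a normalized attribute are illustrative assumptions, not the thesis's tuned values):

```python
def triangular(x, a, b, c):
    """Triangular membership function; the knots (a, b, c) are the
    parameters a genetic algorithm would tune."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Three linguistic terms over a normalized attribute (hypothetical knots;
# shoulder terms use degenerate a == b or b == c)
terms = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
degrees = {name: triangular(0.4, *knots) for name, knots in terms.items()}
```

A crisp attribute value thus belongs to several terms at once with different degrees, which is what lets the fuzzy tree's decision boundary shift smoothly as the knots are adjusted.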
Chen, Ming-Shiang y 陳明祥. "Predicting Student Deviation Behavior By Hybrid Genetic Algorithm/Decision Tree Approach". Thesis, 2006. http://ndltd.ncl.edu.tw/handle/75138559893452241385.
Texto completo
華梵大學
資訊管理學系碩士班
94
Abstract: Because of changing times, a pluralistic society, and pressure from school, family and individual factors, students' deviant behaviors occur more and more often. Once a student's deviant behavior happens, its great impact on the individual or the group may be difficult to compensate. This research uses information technology, namely a hybrid genetic algorithm (GA) and decision tree (DT) approach, to predict whether a student has a tendency toward deviance. The experimental results demonstrate that, among 116 attributes, gender, arriving late to school, and conflict with the father are among the criteria that distinguish whether a student has a tendency toward deviant behavior. Although the questionnaire items (attributes) only partially cover the influencing factors, the average accuracy of this research in predicting whether students have a tendency toward deviance still reaches 86%. This research therefore provides a new idea for educational counseling personnel: student deviation tendencies can be forecast not only by traditional statistical approaches; the hybrid GA/DT-based knowledge learning mechanism can likewise provide richer, clearer and finer rules to forecast what kind of deviant behavior a student may show, for use by the relevant educational personnel. Besides, if the latent influence factors in the rules can be skillfully improved, while counseling and guidance methods are used in coordination to prevent deviance from happening, it should be possible to effectively reduce the rate of student deviant behavior and help limited tutoring manpower achieve good results even when confronting thousands of students. Keywords: Deviant behavior, Genetic algorithm, Decision trees
LAI, I.-CHIEN y 賴以建. "GENETIC ALGORITHM, NEURAL NETWORK AND DECISION TREE IN PRE-WARNING MODELS". Thesis, 2003. http://ndltd.ncl.edu.tw/handle/34939994824622402263.
Texto completo
國立臺北大學
企業管理學系
91
In the past, pre-warning models for financial crises were usually established with traditional statistical methods such as discriminant analysis. However, it is often questionable whether financial data satisfy the assumptions of such models. Therefore, this study investigates the construction of pre-warning models through nonlinear methods such as the Genetic Algorithm and the Neural Network. In addition, since the reference value for the key indicator that most influences business failure cannot be extracted from the pre-warning model, this study uses the Decision Tree technique to extract this reference value. Based upon this, the objectives of this thesis include the following: 1. Identify the chromosome that influences business failure most through the Genetic Algorithm's strong searching capability. 2. Construct financial pre-warning models with Neural Network and traditional Discriminant Analysis techniques, and evaluate their pre-warning performance by comparing their ability to predict business failure three years before its occurrence. 3. Extract the reference value and the key descriptive indicator that influences business failure most through the Decision Tree technique, thus enabling the investing public and the associated authorities to constantly monitor the key financial factors. The main characteristic of the Genetic Algorithm used in this study is its massively parallel optimizing ability. The analyses of the actual data show that: 1. Identifying the genetic component (chromosome) that influences business failure most through the Genetic Algorithm: after 500 generations, the optimal chromosome combination is Operating Income Ratio, Sales per Share, Earnings before Interest/Equity, Net Present Value per Stock (A), Net Present Value per Stock (B), Retained Profit Ratio, Cash Flow Adequacy Ratio, Times Interest Earned, Fixed Asset Turnover Ratio, and Operating Expense Ratio.
2. By employing the key indicators obtained from the Genetic Algorithm, both the Neural Network model and the Discriminant Analysis model can accurately predict business failure (on average, for three-year-ahead prediction, hit ratio: 0.9500 compared with 0.9055). The hit ratios for both models are the same (0.9667) for one-year-ahead prediction. However, the hit ratios for two- and three-year-ahead predictions are higher for the Neural Network model (0.9500 and 0.9333 compared with 0.9166 and 0.8333). This indicates that the Neural Network pre-warning model has a higher probability of successfully predicting business failure earlier. 3. The Decision Tree cannot effectively distinguish the samples of successful and failed businesses. The following results are observed: when the Retained Profit Ratio of a business is larger than 0.9931, the business failure rate is about 88%; when the Retained Profit Ratio is larger than 0.9931 and the Operating Income Ratio is lower than 0.0098, the business failure rate is as high as 97.33%.
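Decision-tree output of this kind reads directly as nested conditions. A sketch transcribing the two thresholds reported above; the branch for ratios at or below 0.9931 is not reported in the abstract, so it is left as `None` rather than invented:

```python
def failure_risk(retained_profit_ratio, operating_income_ratio):
    """Decision-tree rules reported in the thesis, transcribed as
    nested conditions (observed failure rates, not probabilities)."""
    if retained_profit_ratio > 0.9931:
        if operating_income_ratio < 0.0098:
            return 0.9733  # ~97.33% observed failure rate
        return 0.88        # ~88% observed failure rate
    return None            # no rule reported for this branch

risk = failure_risk(1.2, 0.005)
```

This transparency, a threshold and an observed rate per leaf, is exactly why the study turns to a Decision Tree for reference values that the Neural Network model cannot provide.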
Sheif, Kuo-yeith y 謝國義. "An improved algorithm to reduce the computational complexity in decision tree generation". Thesis, 1998. http://ndltd.ncl.edu.tw/handle/33259286892116275891.
Texto completo
Fann, Wen-Chih y 范文誌. "Predicting mortality in patients with necrotizing fasciitis by using decision tree algorithm". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/51141977925238132643.
Texto completo
臺北醫學大學
醫學資訊研究所
97
Objective: To identify simple admission clinical characteristics or laboratory tests that not only predict mortality but also differentiate between high- and low-mortality-risk groups in patients with necrotizing fasciitis. Methods: This retrospective chart-review study included adult patients who were admitted to two hospitals through the emergency department with a discharge diagnosis of necrotizing fasciitis. Both the chi-square test (or the Mann-Whitney U test) and the C4.5 decision tree were used to analyze 23 variables among clinical characteristics and laboratory tests. The main outcome measure was in-hospital mortality. Results: 272 patients were included and the overall mortality rate was 17%. On univariate analysis, significant variables associated with mortality included liver cirrhosis, cancer, chronic kidney disease, adrenal insufficiency, hypotension, white blood cell (WBC) count, WBC band form, and hemoglobin. Three independent predictors of mortality (WBC count, WBC band form, and hypotension) were determined by means of the C4.5 decision tree. From these predictors, six decision rules were produced to classify patients with necrotizing fasciitis into high and low mortality risk groups. The accuracy of the C4.5 decision tree with cross-validation was 84.2% (95% confidence interval, 80.3%-88.1%). Conclusions: By using routine blood pressure measurement and a simple laboratory test, the WBC count and differential, emergency physicians may rapidly identify patients with high mortality.
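C4.5 chooses predictors such as hypotension by entropy-based information gain (its gain ratio further divides by the split's own entropy). A minimal sketch of the gain computation; the patient counts below are invented, not the study's data:

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def info_gain(groups):
    """Information gain of a split, given the label lists of its branches."""
    parent = [y for g in groups for y in g]
    n = len(parent)
    return entropy(parent) - sum(len(g) / n * entropy(g) for g in groups)

# Hypothetical split of patients on hypotension: labels are died (1) / survived (0)
gain = info_gain([[1, 1, 1, 0], [0, 0, 0, 1]])
```

The attribute with the highest gain (ratio) becomes the node's test, which is how the three predictors and the six decision rules in the abstract would emerge from the 23 candidate variables.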
TSOU, YU-CHIEH y 鄒侑捷. "Application of Decision Tree Algorithm to Item Selection Strategies in Multistage Testing". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/e97vsd.
Texto completo
輔仁大學
統計資訊學系應用統計碩士班
107
The purpose of this study is to combine the decision tree model with test theory and apply it to multistage testing. The results indicate that the classification accuracy and root mean square error of the decision tree selection, Fisher information and KL information methods are better than those of random selection. Under the one-parameter model assumption, the costs of the decision tree, Fisher information and KL information methods are similar when student ability follows a uniform distribution. However, when student ability follows a normal distribution, the cost of the decision tree item selection strategy is greater than that of the other methods. Under the two-parameter model assumption, the cost of the decision tree method can be reduced, making the test more economical.
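The Fisher information baseline mentioned above selects, at each stage, the item most informative at the current ability estimate. Under the two-parameter logistic (2PL) model an item's information is I(theta) = a^2 P(1 - P). A minimal sketch (the item bank and ability value are hypothetical):

```python
from math import exp

def p_2pl(theta, a, b):
    """2PL item response probability with discrimination a and difficulty b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Item Fisher information I(theta) = a^2 * P * (1 - P) under the 2PL model."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item bank: (discrimination a, difficulty b)
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
theta = 0.2  # current ability estimate
best_item = max(bank, key=lambda ab: fisher_info(theta, *ab))
```

The decision tree strategy the thesis studies replaces this per-examinee maximization with a pre-built routing tree over item responses, which is where its cost advantage under the two-parameter model comes from.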