Dissertations / Theses: 'Auc-Roc'

1

Zheng, Shimin. "The ROC Curve and the Area under the Curve (AUC)." Digital Commons @ East Tennessee State University, 2017. https://dc.etsu.edu/etsu-works/139.

2

Lu, Qing. "Methods for Designing and Forming Predictive Genetic Tests." Case Western Reserve University School of Graduate Studies / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=case1212197560.

3

Yuan, Yan. "Empirical Likelihood-Based NonParametric Inference for the Difference between Two Partial AUCS." Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/math_theses/32.

Abstract:

Comparing the accuracy of two continuous-scale tests is increasingly important when a new test is developed. The traditional approach, which compares the entire areas under two Receiver Operating Characteristic (ROC) curves, is not sensitive when the two ROC curves cross each other. A better approach to comparing the accuracy of two diagnostic tests is to compare the areas under the two ROC curves (AUCs) over the specificity interval of interest. In this thesis, we propose bootstrap and empirical likelihood (EL) approaches for inference on the difference between two partial AUCs. The empirical likelihood ratio for the difference between two partial AUCs is defined and its limiting distribution is shown to be a scaled chi-square distribution. The EL-based confidence intervals for the difference between two partial AUCs are obtained. Additionally, we conduct simulation studies to compare four proposed EL- and bootstrap-based intervals.
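
As a rough illustration of the quantity this abstract discusses, the following sketch computes an empirical partial AUC by trapezoidal integration of the ROC curve over false positive rates in [0, max_fpr], which corresponds to the specificity interval [1 - max_fpr, 1]. It is not the thesis's inference procedure. The function names, the "score >= threshold means positive" convention, and the choice of an interval starting at FPR 0 are all assumptions made for this example.

```python
import numpy as np

def roc_points(diseased, healthy):
    """Empirical ROC points, starting at (0, 0) and ending at (1, 1).

    Assumed convention: a subject tests positive when score >= threshold.
    """
    diseased = np.asarray(diseased, dtype=float)
    healthy = np.asarray(healthy, dtype=float)
    # Sweep thresholds from the largest observed score to the smallest.
    thresholds = np.unique(np.concatenate([diseased, healthy]))[::-1]
    fpr, tpr = [0.0], [0.0]
    for c in thresholds:
        fpr.append(float(np.mean(healthy >= c)))
        tpr.append(float(np.mean(diseased >= c)))
    return np.array(fpr), np.array(tpr)

def partial_auc(diseased, healthy, max_fpr=1.0):
    """Trapezoidal area under the empirical ROC curve for FPR in [0, max_fpr]."""
    if not 0.0 < max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in (0, 1]")
    fpr, tpr = roc_points(diseased, healthy)
    # First ROC point whose FPR reaches max_fpr (fpr is non-decreasing).
    k = int(np.searchsorted(fpr, max_fpr, side="left"))
    # Linearly interpolate the curve at FPR = max_fpr on the segment k-1 -> k.
    slope = (tpr[k] - tpr[k - 1]) / (fpr[k] - fpr[k - 1])
    tpr_cut = tpr[k - 1] + slope * (max_fpr - fpr[k - 1])
    x = np.append(fpr[:k], max_fpr)
    y = np.append(tpr[:k], tpr_cut)
    # Manual trapezoid rule over the truncated curve.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))
```

With max_fpr=1.0 the function returns the full trapezoidal AUC, so the difference between two tests' partial AUCs can be formed by calling it once per test with the same max_fpr.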

4

Huang, Xin. "Bootstrap and Empirical Likelihood-based Semi-parametric Inference for the Difference between Two Partial AUCs." Digital Archive @ GSU, 2008. http://digitalarchive.gsu.edu/math_theses/54.

Abstract:

With new tests being developed and marketed, the comparison of the diagnostic accuracy of two continuous-scale diagnostic tests is of great importance. Comparing the partial areas under the receiver operating characteristic curves (pAUC) is an effective method to evaluate the accuracy of two diagnostic tests. In this thesis, we study semi-parametric inference for the difference between two pAUCs. A normal approximation for the distribution of the difference between two pAUCs is derived. The empirical likelihood ratio for the difference between two pAUCs is defined and its asymptotic distribution is shown to be a scaled chi-square distribution. Bootstrap and empirical likelihood based inferential methods for the difference are proposed. We construct five confidence intervals for the difference between two pAUCs. Simulation studies are conducted to compare the finite sample performance of these intervals. We also apply our recommended intervals to a real example.

5

Sun, Fangfang. "Semi-parametric inference for the partial area under the ROC curve." unrestricted, 2008. http://etd.gsu.edu/theses/available/etd-11192008-113213/.

Abstract:

Thesis (M.S.)--Georgia State University, 2008.
Title from file title page. Gengsheng Qin, committee chair; Yu-Sheng Hsu, Yixin Fang, Yuanhui Xiao, committee members. Description based on contents viewed July 22, 2009. Includes bibliographical references (p. 29-30).

6

Zhou, Haochuan. "Statistical Inferences for the Youden Index." Digital Archive @ GSU, 2011. http://digitalarchive.gsu.edu/math_diss/5.

Abstract:

In diagnostic test studies, one crucial task is to evaluate the diagnostic accuracy of a test. Currently, most studies focus on the Receiver Operating Characteristic curve and the Area Under the Curve. On the other hand, the Youden index, widely applied in practice, is another comprehensive measurement for the performance of a diagnostic test. For a continuous-scale test classifying diseased and non-diseased groups, finding the Youden index of the test is equivalent to maximizing the sum of sensitivity and specificity over all possible values of the cut-point. This dissertation concentrates on statistical inferences for the Youden index. First, an auxiliary tool for the Youden index, called the diagnostic curve, is defined and used to evaluate the diagnostic test. Second, in the paired-design study to assess the diagnostic accuracy of two biomarkers, the difference in paired Youden indices frequently acts as an evaluation standard. We propose an exact confidence interval for the difference in paired Youden indices based on generalized pivotal quantities. A maximum likelihood estimate-based interval and a bootstrap-based interval are also included in the study. Third, for certain diseases, an intermediate level exists between diseased and non-diseased status. With this concern in mind, we define the Youden index for three ordinal groups, propose the empirical estimate of the Youden index, study the asymptotic properties of the empirical Youden index estimate, and construct parametric and nonparametric confidence intervals for the Youden index. Finally, since covariates often affect the accuracy of a diagnostic test, we propose estimates for the Youden index with a covariate adjustment under heteroscedastic regression models for the test results. Asymptotic properties of the covariate-adjusted Youden index estimators are investigated under normal error and non-normal error assumptions.
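
The cut-point maximization described in this abstract can be sketched as follows. This is only a minimal empirical estimate of J = max over c of {sensitivity(c) + specificity(c) - 1}, not the dissertation's inference methods. The function name, the "score > c means positive" convention, and the restriction of candidate cut-points to observed scores are assumptions of the example.

```python
import numpy as np

def youden_index(diseased, healthy):
    """Empirical Youden index and the first cut-point that attains it.

    Assumed convention: a subject is classified as diseased when score > c.
    """
    diseased = np.asarray(diseased, dtype=float)
    healthy = np.asarray(healthy, dtype=float)
    # Candidate cut-points: every distinct observed score, in ascending order.
    cuts = np.unique(np.concatenate([diseased, healthy]))
    best_j, best_cut = -1.0, None
    for c in cuts:
        sensitivity = np.mean(diseased > c)   # true positive rate at c
        specificity = np.mean(healthy <= c)   # true negative rate at c
        j = sensitivity + specificity - 1.0
        if j > best_j:                        # keep the smallest maximizing cut
            best_j, best_cut = float(j), float(c)
    return best_j, best_cut
```

Because ties are broken toward the smallest cut-point, two samples with a plateau of equally good cut-points return the left end of that plateau.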

7

Xu, Ping. "Evaluation of Repeated Biomarkers: Non-parametric Comparison of Areas under the Receiver Operating Curve Between Correlated Groups Using an Optimal Weighting Scheme." Scholar Commons, 2012. http://scholarcommons.usf.edu/etd/4261.

Abstract:

Receiver Operating Characteristic (ROC) curves are often used to evaluate the prognostic performance of a continuous biomarker. In previous research, a non-parametric ROC approach was introduced to compare two biomarkers with repeated measurements. An asymptotically normal statistic, which contains subject-specific weights, was developed to estimate the areas under the ROC curve of biomarkers. Although the previous study suggested two weighting schemes as optimal when the within-subject correlation is 1 or 0, the universally optimal weight was not determined. We modify this asymptotic statistic to compare AUCs between two correlated groups and propose a solution to weight optimization in non-parametric AUC comparison to improve the efficiency of the estimator. We demonstrate how the Lagrange multiplier can be used to find the weights that minimize the variance function subject to constraints. We show substantial gains in efficiency from the novel weighting scheme when the correlation within group is high, the correlation between groups is high, and/or the disease incidence is small, which is the case for many longitudinal matched case-control studies. An illustrative example applies the proposed methodology to a thyroid function dataset. Simulation results suggest that the optimal weight performs well with a sample size as small as 50 per group.

8

Bitara, Matúš. "Srovnání heuristických a konvenčních statistických metod v data miningu." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2019. http://www.nusl.cz/ntk/nusl-400833.

Abstract:

The thesis deals with the comparison of conventional and heuristic methods in data mining used for binary classification. In the theoretical part, four different models are described. Model classification is demonstrated on simple examples. In the practical part, the models are compared on real data. This part also covers data cleaning, outlier removal, two different transformations and dimension reduction. The last part describes the methods used to test the quality of the models.

9

Khamesipour, Alireza. "IMPROVED GENE PAIR BIOMARKERS FOR MICROARRAY DATA CLASSIFICATION." OpenSIUC, 2018. https://opensiuc.lib.siu.edu/dissertations/1573.

Abstract:

The Top Scoring Pair (TSP) classifier, based on the notion of relative ranking reversals in the expressions of two marker genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. We introduce the AUC-based TSP classifier, which is based on the Area Under the ROC (Receiver Operating Characteristic) Curve. The AUCTSP classifier works according to the same principle as TSP but differs from the latter in that the probabilities that determine the top scoring pair are computed based on the relative rankings of the two marker genes across all subjects as opposed to for each individual subject. Although the classification is still done on an individual subject basis, the generalization that the AUC-based probabilities provide during training yields an overall better and more stable classifier. Through extensive simulation results and case studies involving classification in ovarian, leukemia, colon, and breast and prostate cancers and diffuse large B-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative pivot genes. The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects as opposed to each individual subject results in significant performance gains in classification accuracy. In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair.
We have also proposed the use of the AUC test statistic in order to reduce the computational cost of the TSP in selecting the most informative pair of genes for diagnosing a specific disease. We have proven the efficacy of our proposed method through case studies in ovarian, colon, leukemia, breast and prostate cancers and diffuse large B-cell lymphoma in selecting informative genes. We have compared the selected pairs, computational cost, running time and classification performance of a subset of differentially expressed genes selected based on the AUC probability with the original TSP in the aforementioned datasets. The reduced-size TSP has proven to dramatically reduce the computational cost and time complexity of selecting the top scoring pair of genes in comparison to the original TSP in all of the case studies without degrading the performance of the classifier. Using the AUC probability, we were able to reduce the computational cost and CPU running time of the TSP by 79% and 84% respectively on average in the tested case studies. In addition, the use of the AUC probability prior to applying the TSP tends to avoid the selection of genes that are not expressed ("pivot" genes) due to the imposed condition. We have demonstrated through LOOCV and 5-fold cross validation that the reduced-size TSP and TSP perform approximately the same in terms of classification accuracy for smaller threshold values. In conclusion, we suggest the use of the AUC test statistic in reducing the size of the dataset for the extensions of the TSP method, e.g. the k-TSP and TST, in order to make these methods feasible and cost effective.
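
For orientation, the ranking-reversal idea behind the original TSP classifier mentioned in this abstract can be sketched as below. The sketch scores each gene pair (i, j) by the absolute difference, between the two classes, of the within-class frequency of the event "expression of gene i < expression of gene j", and returns the best pair. It does not implement the AUC-based variant proposed in the dissertation. The function name, the 0/1 label coding, and the exhaustive pair search are assumptions of the example, and both classes are assumed to be non-empty.

```python
from itertools import combinations

import numpy as np

def top_scoring_pair(X, y):
    """Return (i, j, score) for the gene pair with the largest TSP score.

    X: array of shape (n_samples, n_genes) with expression values.
    y: array of 0/1 class labels, one per sample (both classes present).
    Score of a pair: |P(X_i < X_j | y = 0) - P(X_i < X_j | y = 1)|.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    best = (None, None, -1.0)
    for i, j in combinations(range(X.shape[1]), 2):
        less = X[:, i] < X[:, j]          # ranking of the two genes per sample
        p0 = less[y == 0].mean()          # frequency of "i below j" in class 0
        p1 = less[y == 1].mean()          # frequency of "i below j" in class 1
        score = abs(p0 - p1)
        if score > best[2]:
            best = (i, j, float(score))
    return best
```

The exhaustive search visits every pair of genes, which is the quadratic cost that the abstract's AUC-based pre-filtering is meant to reduce.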

10

Wang, Binhuan. "Statistical Evaluation of Continuous-Scale Diagnostic Tests with Missing Data." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/math_diss/8.

Abstract:

The receiver operating characteristic (ROC) curve methodology is the statistical methodology for assessing the accuracy of diagnostic tests or biomarkers. Currently, the most widely used statistical methods for inference on ROC curves are complete-data based parametric, semi-parametric or nonparametric methods. However, these methods cannot be used in diagnostic applications with missing data. In practical situations, missing diagnostic data occur commonly for various reasons, such as medical tests being too expensive, too time consuming or too invasive. This dissertation aims to develop new nonparametric statistical methods for evaluating the accuracy of diagnostic tests or biomarkers in the presence of missing data. Specifically, novel nonparametric statistical methods will be developed with different types of missing data for (i) the inference of the area under the ROC curve (AUC, which is a summary index for the diagnostic accuracy of the test) and (ii) the joint inference of the sensitivity and the specificity of a continuous-scale diagnostic test. In this dissertation, we will provide a general framework that combines the empirical likelihood and general estimation equations with nuisance parameters for the joint inferences of sensitivity and specificity with missing diagnostic data. The proposed methods will have sound theoretical properties. The theoretical development is challenging because the proposed profile log-empirical likelihood ratio statistics are not the standard sum of independent random variables. The new methods have the power of likelihood-based approaches and the jackknife method in ROC studies. Therefore, they are expected to be more robust, more accurate and less computationally intensive than existing methods in the evaluation of competing diagnostic tests.

11

Albakour, Subhy. "Stream-automl : automated machine learning over imbalanced data streams for bipartite ranking problems." Electronic Thesis or Diss., Institut polytechnique de Paris, 2024. http://www.theses.fr/2024IPPAT015.

Abstract:

Despite its popularity in the scientific literature, stream learning has yet to substantiate its practical utility in industrial applications. Characterized by the incessant influx of high-velocity, voluminous, and dynamically changing data, online marketing seems to be the favorite candidate for stream learning to make its entry into the industry. In this context, state-of-the-art stream learning is of little utility, as it mainly focuses on classification, while bipartite ranking constitutes better modeling of the problem of online marketing. Recently, the combination of stream learning and AutoML, i.e., Stream-AutoML, has been drawing more attention from the scientific community. This work investigates the applicability of Stream-AutoML to bipartite ranking problems when data is imbalanced. We commence by developing a framework to execute and evaluate Stream-AutoML pipelines of stream learning models. Then we propose a framework for computing AUC-ROC incrementally, as well as introducing exponential decay to serve as a forgetting mechanism. We also propose a framework for concept drift detection using AUC-ROC, for which we develop six statistical tests for differences in AUC-ROC with theoretical bounds on type I and type II errors. Finally, we propose four data generators that enrich the tool kit for evaluating concept drift detectors under controlled environments. Results have shown that the proposed methods considerably reduce the resources allocated for evaluation and detect concept drifts with very few false positives. These contributions prepare the field for Stream-AutoML to solve bipartite ranking problems, which can then be exploited in online marketing applications. Optimized implementations of the proposed methods were developed and have already been adopted in the online marketing product of IDAaaS.

12

Yang, Hanfang. "Jackknife Emperical Likelihood Method and its Applications." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/math_diss/9.

Abstract:

In this dissertation, we investigate jackknife empirical likelihood methods motivated by recent research in statistics and other related fields. The computational intensity of empirical likelihood can be significantly reduced by using jackknife empirical likelihood methods without losing computational accuracy and stability. We demonstrate that the proposed jackknife empirical likelihood methods are able to handle several challenging and open problems, with elegant asymptotic properties and accurate simulation results in finite samples. These problems include ROC curves with missing data, the difference of two ROC curves in two-dimensional correlated data, a novel inference for the partial AUC, and the difference of two quantiles with one or two samples. In addition, empirical likelihood methodology can be successfully applied to the linear transformation model using adjusted estimation equations. Comprehensive simulation studies on coverage probabilities and average lengths for those topics demonstrate that the proposed jackknife empirical likelihood methods perform well in finite samples under various settings. Moreover, some related real problems are studied to support our conclusions. In the end, we provide an extensive discussion of some interesting and feasible ideas based on our jackknife EL procedures for future studies.

13

Yu, Daoping. "Early Stopping of a Neural Network via the Receiver Operating Curve." Digital Commons @ East Tennessee State University, 2010. https://dc.etsu.edu/etd/1732.

Abstract:

This thesis presents the area under the ROC (Receiver Operating Characteristic) curve, abbreviated AUC, as an alternative measure for evaluating the predictive performance of artificial neural network (ANN) classifiers. Conventionally, neural networks are trained until the total error converges to zero, which may give rise to over-fitting problems. To ensure that they do not overfit the training data and then fail to generalize well to new data, it appears effective to integrate ROC/AUC analysis into the training process and stop training as soon as the AUC is sufficiently large. In order to reduce learning costs on imbalanced data sets with uneven class distribution, random sampling and k-means clustering are implemented to draw a smaller subset of representatives from the original training data set. Finally, the confidence interval for the AUC is estimated using a non-parametric approach.

14

Hansén, Jacob, and Axel Gustafsson. "A Study on Comparison Websites in the Airline Industry and Using CART Methods to Determine Key Parameters in Flight Search Conversion." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254309.

Abstract:

This bachelor thesis in applied mathematics and industrial engineering and management aimed to identify relationships between search parameters in flight comparison search engines and the exit conversion rate, while also investigating how the emergence of such comparison search engines has impacted the airline industry. To identify such relationships, several classification models were employed in conjunction with several sampling methods to produce a predictive model using the program R. To investigate the impact of the emergence of comparison websites, Porter's 5 forces and a SWOT analysis were employed to analyze the findings of a literature study and a qualitative interview. The classification models developed performed poorly with regard to several assessment metrics, which suggested that there was little to no significant relationship between the search parameters investigated and the exit conversion rate. Porter's 5 forces and the SWOT analysis suggested that the landscape of the airline industry has become more competitive and that airlines which do not manage to adapt to this changing market environment will experience decreasing profitability.

15

Mackových, Marek. "Analýza experimentálních EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2016. http://www.nusl.cz/ntk/nusl-241981.

Abstract:

This thesis focuses on the analysis of experimental ECG records obtained from isolated rabbit hearts and aims to describe the changes in the ECG caused by ischemia and left ventricular hypertrophy. It consists of a theoretical analysis of the problems in evaluating the ECG during ischemia and hypertrophy, and describes an experimental ECG recording. The theoretical part is followed by a practical section that describes the method for calculating morphological parameters, applies ROC analysis to evaluate their suitability for the classification of hypertrophy, and finally addresses the classification itself.

16

Pospíšil, Lukáš. "Analýza ROC křivek zvukových signálů a jejich srovnání." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-316445.

Abstract:

This thesis deals with the opportunity to use ROC curves in describing methods that work with sound signals. Specifically, it focuses on ways of detecting stress in speech signals. The detection itself is done over a range of frequencies of the sound signal. A classifier that decides whether the input signal is stressed or not is also designed using ROC curves. The outputs of this thesis are the findings gathered from the analyses and some recommendations based on them.

17

Plch, Vít. "Detekce fibrilace síní v EKG." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-402125.

Abstract:

This diploma thesis deals with the detection of atrial fibrillation from heart rate variability (HRV) and with the classification of Poincaré maps, ending with the division of the records into two groups: those with detected atrial fibrillation and those without. The result is a decision on which variables are statistically significant for identifying atrial fibrillation and which are not, together with a classification of the ECG signals.

18

Li, Yi. "A Generalization of AUC to an Ordered Multi-Class Diagnosis and Application to Longitudinal Data Analysis on Intellectual Outcome in Pediatric Brain-Tumor Patients." Digital Archive @ GSU, 2009. http://digitalarchive.gsu.edu/math_diss/1.

Abstract:

Receiver operating characteristic (ROC) curves have been widely used to evaluate the goodness of diagnostic methods in many fields of study, such as disease diagnosis in medicine. The area under the ROC curve (AUC) naturally became one of the most used measures for gauging the goodness of a diagnosis (Mossman, Somoza 1991). Since medical diagnosis often is not dichotomous, the ROC curve and AUC need to be generalized to a multi-dimensional case. The generalization of AUC to the multi-class case has been studied by many researchers in the past decade. Most recently, Nakas & Yiannoutsos (2004) considered the ordered d-class ROC analysis by only considering the sensitivities of each class. Hence, their dimension is only d. Cha (2005) considered more types of misclassification in the ordered multiple-class case, but reduced the dimension of Ferri et al. from d(d-1) to 2(d-1). In this dissertation we adjust and calculate the VUS for an ordered multiple-class diagnosis with Cha's 2(d-1)-dimension method. Our methodology for finding the VUS is introduced. We present the method of adjusting and calculating the VUS and its statistical inference for the 2(d-1)-dimension case. Some simulation results are included and a real example is presented. Intellectual outcomes in pediatric brain-tumor patients were investigated in a prospective longitudinal study. The Stanford-Binet Intelligence Scale-Fourth Edition (SB-IV) Standard Age Score (SAS) and Composite intelligence quotient (IQ) score are examined as cognitive outcomes in pediatric brain-tumor patients. Treatment factors, patient factors and time since diagnosis are taken into account as the risk factors. Hierarchical linear/quadratic models and Gompertz-based hierarchical nonlinear growth models were applied to build linear and nonlinear longitudinal curves. We use PRESS and Volume Under the Surface (VUS) as the criteria to compare these two methods. Some model interpretations are presented in this dissertation.

19

Tang, Hong. "A Comparison of Two Modeling Techniques in Customer Targeting For Bank Telemarketing." 2014. http://scholarworks.gsu.edu/math_theses/139.

Abstract:

Customer targeting is the key to the success of bank telemarketing. To compare the flexible discriminant analysis and the logistic regression in customer targeting, a survey dataset from a Portuguese bank was used. For the flexible discriminant analysis model, the backward elimination of explanatory variables was used with several rounds of manual re-defining of dummy variables. For the logistic regression model, the automatic stepwise selection was performed to decide which explanatory variables should be left in the final model. Ten-fold stratified cross validation was performed to estimate the model parameters and accuracies. Although employing different sets of explanatory variables, the flexible discriminant analysis model and the logistic regression model show equally satisfactory performances in customer classification based on the areas under the receiver operating characteristic curves. Focusing on the predicted “right” customers, the logistic regression model shows slightly better classification and higher overall correct prediction rate.

20

Werner, Carola. "Nichtparametrische Analyse von diagnostischen Tests." Doctoral thesis, 2006. http://hdl.handle.net/11858/00-1735-0000-000D-F21E-A.

21

Lange, Katharina. "Nichtparametrische Analyse diagnostischer Gütemaße bei Clusterdaten." Doctoral thesis, 2011. http://hdl.handle.net/11858/00-1735-0000-000D-F1D1-B.

22

Bednář, Ondřej. "Srovnání modifikací predikčních bankrotních modelů." Master's thesis, 2017. http://www.nusl.cz/ntk/nusl-431270.

Abstract:

The goal of this thesis is to compare existing bankruptcy prediction models with a new modification developed in this work, which could perform better than its competition. The proposed model is logit-based and consists of a combination of the variables used in Altman's and Ohlson's models. The final model is estimated for medium-sized companies in the EU which aren't publicly traded. This model achieved a prediction accuracy of 97.1% (97.4% for healthy and 91.1% for bankrupt companies) on its original dataset. As expected, when verified on a new dataset, the accuracy dropped but still reaches 97.1% (99.3% for healthy and 37.7% for bankrupt companies). The model is compared with its competition (original and modified versions of Ohlson's and partially Altman's models) and it is shown that it has higher prediction accuracy.

23

Cruz, Rafael Cunha. "Determinants of bankruptcy in the portuguese shoe manufacturing industry." Master's thesis, 2020. http://hdl.handle.net/10400.14/32108.

Abstract:

The purpose of this thesis is to investigate bankruptcy in the Portuguese shoe manufacturing industry by building a model able to predict it 1 year in advance, and by analyzing which variables are its main drivers. To this end, a logistic approach was taken to build (and later validate) 5 models from a sample of 2,073 Portuguese shoe manufacturing firms, from 2006 to 2018, in which there was a total of 422 bankruptcy-like events. We found 2 of our models to have an "acceptable discrimination" ability, presenting an AUC between 0.7 and 0.8, which also revealed that ratios related to profitability, leverage and liquidity are the ones with the most relevant impact on bankruptcy probability.

24

Stones, George. "Predicting Community-based Methadone Maintenance Treatment (MMT) Outcome." Thesis, 2012. http://hdl.handle.net/1807/34932.

Abstract:

This was a retrospective study of a community-based methadone maintenance treatment (MMT) program in Toronto. Participants (N = 170) were federally sentenced adult male offenders admitted to this voluntary program between 1997 and 2009 while subject to community supervision following incarceration. The primary investigation examined correlates of treatment responsivity, with principal outcome measures including MMT clients’ rates of: (i) illicit drug use; and (ii) completion of conditional (parole) or statutory release (SR). For a subset (n = 74), recidivism rates were examined after a 9-year interval. Findings included strong convergent evidence from logistic regression and ROC analyses that an empirically and theoretically derived set of five variables was a stable and highly significant (p < .001) predictor of release outcome. Using five factors related to risk (work/school status, security level of releasing institution, total PCL-R score, history of institutional drug use, and days at risk), release outcome was predicted with an overall classification accuracy of 88%, with high specificity (86%) and sensitivity (89%). The logistic regression model generated an R² of .55 and the accompanying AUC was .89, both substantial. Work/school status had an extremely large positive association with successful completion of community supervision, accounting for more than half of the total variance explained by the five-factor model and increasing the estimated odds of successful release outcome more than 15-fold. Also, when in the MMT program, clients' risk-taking behaviour was significantly moderated, with low overall base rates of illicit drug use, yet the rate of parole/SR revocation (71%) was high. The 9-year follow-up showed a high mortality rate (15%) overall. Revocation of release while in the MMT program was associated with a significantly higher rate of recidivism, and with more violent recidivism, at follow-up.
Results are discussed within the context of: (a) Andrews and Bonta's psychology of criminal conduct; (b) the incompatibility of a harm reduction treatment model with an abstinence-based parole decision-making model; (c) changing drug use profiles among MMT clients; (d) a strength-based approach to correctional intervention focusing on educational and vocational retraining initiatives; and (e) creation of a user-friendly case-based screening algorithm for prediction of release outcome for new releases.

25

Melo, André Pestana Sampaio e. "Cálculo do limite superior para a capacidade discriminante de modelos preditivos baseados na informação disponível – variáveis dependentes dicotómicas." Master's thesis, 2011. http://hdl.handle.net/10362/8293.

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management.
When the discriminant power of a given model (with a dichotomous dependent variable) is assessed using the ROC curve, it is usual to plot the “perfect model” and the “random model” on the same graph as theoretical (upper and lower) limits on discriminant capacity. This work proposes the calculation of a complementary upper limit, derived from the data and conceptually distinct from the one obtained via the “perfect model”. This new limit is called the “discriminant capacity of the data” used to develop the model(s), and it is associated with the Probabilistic a Posteriori Classifier (AP) model. On the practical side, the approach allows an a priori estimate (before the exhaustive modelling work itself) of the potential quality of the data for addressing the prediction problem at hand, and helps to quickly screen the most promising variables to include in the future predictive model. On the theoretical side, the approach makes it possible to assess and compare how effectively different predictive models capture the discriminant capacity contained in the data. The theoretical results are complemented with empirical illustrations obtained by fitting two distinct methodologies, logistic regression and neural networks, to a data file containing information on the credit behaviour of 46,000 clients. The practical results also make evident how the “new” limit relates to the topic of overfitting.
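
One way to read the data-derived upper limit described above is sketched here. This is an interpretation of the abstract, not the author's implementation: it assumes the a posteriori classifier scores each record by the observed event rate among all records sharing exactly the same predictor values, and all function names are hypothetical. Ranking by that rate gives the highest in-sample AUC any function of those predictors can reach.

```python
from collections import defaultdict


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def data_upper_bound_auc(rows, labels):
    """In-sample AUC of the empirical-posterior score.

    Assumption: each record is scored by the event rate of its exact
    predictor pattern, as a stand-in for the abstract's AP classifier.
    """
    counts = defaultdict(lambda: [0, 0])  # pattern -> [records, events]
    for x, y in zip(rows, labels):
        entry = counts[tuple(x)]
        entry[0] += 1
        entry[1] += y
    scores = [counts[tuple(x)][1] / counts[tuple(x)][0] for x in rows]
    return auc(scores, labels)


# Toy data: one categorical predictor, binary outcome.
rows = [("a",), ("a",), ("a",), ("b",), ("b",), ("c",)]
labels = [1, 1, 0, 0, 1, 0]
bound = data_upper_bound_auc(rows, labels)

# Any other scoring of the same predictor cannot rank better in-sample.
other = auc([{"a": 1, "b": 2, "c": 0}[x[0]] for x in rows], labels)
```

With many predictors most patterns contain a single record, so this bound approaches 1 regardless of real signal, which is one way the limit connects to overfitting.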
