To see the other types of publications on this topic, follow this link: Interpretable ML.

Journal articles on the topic "Interpretable ML"

Create a correct reference in APA, MLA, Chicago, Harvard, and several other styles


Consult the top 50 journal articles for your research on the topic "Interpretable ML".

Next to each source in the list of references there is an "Add to bibliography" button. Click this button, and we will automatically generate the bibliographic reference for the chosen source in your preferred citation style: APA, MLA, Harvard, Vancouver, Chicago, etc.

You can also download the full text of the scholarly publication in PDF format and read its abstract online whenever this information is included in the metadata.

Browse journal articles on a wide variety of disciplines and organize your bibliography correctly.

1

Zytek, Alexandra, Ignacio Arnaldo, Dongyu Liu, Laure Berti-Equille, and Kalyan Veeramachaneni. "The Need for Interpretable Features." ACM SIGKDD Explorations Newsletter 24, no. 1 (June 2, 2022): 1–13. http://dx.doi.org/10.1145/3544903.3544905.

Abstract:
Through extensive experience developing and explaining machine learning (ML) applications for real-world domains, we have learned that ML models are only as interpretable as their features. Even simple, highly interpretable model types such as regression models can be difficult or impossible to understand if they use uninterpretable features. Different users, especially those using ML models for decision-making in their domains, may require different levels and types of feature interpretability. Furthermore, based on our experiences, we claim that the term "interpretable feature" is not specific nor detailed enough to capture the full extent to which features impact the usefulness of ML explanations. In this paper, we motivate and discuss three key lessons: 1) more attention should be given to what we refer to as the interpretable feature space, or the state of features that are useful to domain experts taking real-world actions, 2) a formal taxonomy is needed of the feature properties that may be required by these domain experts (we propose a partial taxonomy in this paper), and 3) transforms that take data from the model-ready state to an interpretable form are just as essential as traditional ML transforms that prepare features for the model.
2

Wu, Bozhi, Sen Chen, Cuiyun Gao, Lingling Fan, Yang Liu, Weiping Wen, and Michael R. Lyu. "Why an Android App Is Classified as Malware." ACM Transactions on Software Engineering and Methodology 30, no. 2 (March 2021): 1–29. http://dx.doi.org/10.1145/3423096.

Abstract:
Machine learning–(ML) based approach is considered as one of the most promising techniques for Android malware detection and has achieved high accuracy by leveraging commonly used features. In practice, most of the ML classifications only provide a binary label to mobile users and app security analysts. However, stakeholders are more interested in the reason why apps are classified as malicious in both academia and industry. This belongs to the research area of interpretable ML but in a specific research domain (i.e., mobile malware detection). Although several interpretable ML methods have been exhibited to explain the final classification results in many cutting-edge Artificial Intelligent–based research fields, until now, there is no study interpreting why an app is classified as malware or unveiling the domain-specific challenges. In this article, to fill this gap, we propose a novel and interpretable ML-based approach (named XMal ) to classify malware with high accuracy and explain the classification result meanwhile. (1) The first classification phase of XMal hinges multi-layer perceptron and attention mechanism and also pinpoints the key features most related to the classification result. (2) The second interpreting phase aims at automatically producing neural language descriptions to interpret the core malicious behaviors within apps. We evaluate the behavior description results by leveraging a human study and an in-depth quantitative analysis. Moreover, we further compare XMal with the existing interpretable ML-based methods (i.e., Drebin and LIME) to demonstrate the effectiveness of XMal . We find that XMal is able to reveal the malicious behaviors more accurately. Additionally, our experiments show that XMal can also interpret the reason why some samples are misclassified by ML classifiers. Our study peeks into the interpretable ML through the research of Android malware detection and analysis.
3

Yang, Ziduo, Weihe Zhong, Lu Zhao, and Calvin Yu-Chian Chen. "ML-DTI: Mutual Learning Mechanism for Interpretable Drug–Target Interaction Prediction." Journal of Physical Chemistry Letters 12, no. 17 (April 27, 2021): 4247–61. http://dx.doi.org/10.1021/acs.jpclett.1c00867.

4

Lin, Zhiqing. "A Methodological Review of Machine Learning in Applied Linguistics." English Language Teaching 14, no. 1 (December 23, 2020): 74. http://dx.doi.org/10.5539/elt.v14n1p74.

Abstract:
The traditional linear regression in applied linguistics (AL) suffers from the drawbacks arising from the strict assumptions namely: linearity, and normality, etc. More advanced methods are needed to overcome the shortcomings of the traditional method and grapple with intricate linguistic problems. However, there is no previous review on the applications of machine learning (ML) in AL, the introduction of interpretable ML, and related practical software. This paper addresses these gaps by reviewing the representative algorithms of ML in AL. The result shows that ML is applicable in AL and enjoys a promising future. It goes further to discuss the applications of interpretable ML for reporting the results in AL. Finally, it ends with the recommendations of the practical programming languages, software, and platforms to implement ML for researchers in AL to foster the interdisciplinary studies between AL and ML.
5

Abdullah, Talal A. A., Mohd Soperi Mohd Zahid, and Waleed Ali. "A Review of Interpretable ML in Healthcare: Taxonomy, Applications, Challenges, and Future Directions." Symmetry 13, no. 12 (December 17, 2021): 2439. http://dx.doi.org/10.3390/sym13122439.

Abstract:
We have witnessed the impact of ML in disease diagnosis, image recognition and classification, and many more related fields. Healthcare is a sensitive field related to people’s lives in which decisions need to be carefully taken based on solid evidence. However, most ML models are complex, i.e., black-box, meaning they do not provide insights into how the problems are solved or why such decisions are proposed. This lack of interpretability is the main reason why some ML models are not widely used yet in real environments such as healthcare. Therefore, it would be beneficial if ML models could provide explanations allowing physicians to make data-driven decisions that lead to higher quality service. Recently, several efforts have been made in proposing interpretable machine learning models to become more convenient and applicable in real environments. This paper aims to provide a comprehensive survey and symmetry phenomena of IML models and their applications in healthcare. The fundamental characteristics, theoretical underpinnings needed to develop IML, and taxonomy for IML are presented. Several examples of how they are applied in healthcare are investigated to encourage and facilitate the use of IML models in healthcare. Furthermore, current limitations, challenges, and future directions that might impact applying ML in healthcare are addressed.
6

Sajid, Mirza Rizwan, Arshad Ali Khan, Haitham M. Albar, Noryanti Muhammad, Waqas Sami, Syed Ahmad Chan Bukhari, and Iram Wajahat. "Exploration of Black Boxes of Supervised Machine Learning Models: A Demonstration on Development of Predictive Heart Risk Score." Computational Intelligence and Neuroscience 2022 (May 12, 2022): 1–11. http://dx.doi.org/10.1155/2022/5475313.

Abstract:
Machine learning (ML) often provides applicable high-performance models to facilitate decision-makers in various fields. However, this high performance is achieved at the expense of the interpretability of these models, which has been criticized by practitioners and has become a significant hindrance in their application. Therefore, in highly sensitive decisions, black boxes of ML models are not recommended. We proposed a novel methodology that uses complex supervised ML models and transforms them into simple, interpretable, transparent statistical models. This methodology is like stacking ensemble ML in which the best ML models are used as a base learner to compute relative feature weights. The index of these weights is further used as a single covariate in the simple logistic regression model to estimate the likelihood of an event. We tested this methodology on the primary dataset related to cardiovascular diseases (CVDs), the leading cause of mortalities in recent times. Therefore, early risk assessment is an important dimension that can potentially reduce the burden of CVDs and their related mortality through accurate but interpretable risk prediction models. We developed an artificial neural network and support vector machines based on ML models and transformed them into a simple statistical model and heart risk scores. These simplified models were found transparent, reliable, valid, interpretable, and approximate in predictions. The findings of this study suggest that complex supervised ML models can be efficiently transformed into simple statistical models that can also be validated.
7

Singh, Devesh. "Interpretable Machine-Learning Approach in Estimating FDI Inflow: Visualization of ML Models with LIME and H2O." TalTech Journal of European Studies 11, no. 1 (May 1, 2021): 133–52. http://dx.doi.org/10.2478/bjes-2021-0009.

Abstract:
In advancement of interpretable machine learning (IML), this research proposes local interpretable model-agnostic explanations (LIME) as a new visualization technique in a novel informative way to analyze the foreign direct investment (FDI) inflow. This article examines the determinants of FDI inflow through IML with a supervised learning method to analyze the foreign investment determinants in Hungary by using an open-source artificial intelligence H2O platform. This author used three ML algorithms—general linear model (GML), gradient boosting machine (GBM), and random forest (RF) classifier—to analyze the FDI inflow from 2001 to 2018. The result of this study shows that in all three classifiers GBM performs better to analyze FDI inflow determinants. The variable value of production in a region is the most influenced determinant to the inflow of FDI in Hungarian regions. Explanatory visualizations are presented from the analyzed dataset, which leads to their use in decision-making.
8

Carreiro Pinasco, Gustavo, Eduardo Moreno Júdice de Mattos Farina, Fabiano Novaes Barcellos Filho, Willer França Fiorotti, Matheus Coradini Mariano Ferreira, Sheila Cristina de Souza Cruz, Andre Louzada Colodette, et al. "An interpretable machine learning model for covid-19 screening." Journal of Human Growth and Development 32, no. 2 (June 23, 2022): 268–74. http://dx.doi.org/10.36311/jhgd.v32.13324.

Abstract:
Introduction: the Coronavirus Disease 2019 (COVID-19) is a viral disease which has been declared a pandemic by the WHO. Diagnostic tests are expensive and are not always available. Researches using machine learning (ML) approach for diagnosing SARS-CoV-2 infection have been proposed in the literature to reduce cost and allow better control of the pandemic. Objective: we aim to develop a machine learning model to predict if a patient has COVID-19 with epidemiological data and clinical features. Methods: we used six ML algorithms for COVID-19 screening through diagnostic prediction and did an interpretative analysis using SHAP models and feature importances. Results: our best model was XGBoost (XGB) which obtained an area under the ROC curve of 0.752, a sensitivity of 90%, a specificity of 40%, a positive predictive value (PPV) of 42.16%, and a negative predictive value (NPV) of 91.0%. The best predictors were fever, cough, history of international travel less than 14 days ago, male gender, and nasal congestion, respectively. Conclusion: We conclude that ML is an important tool for screening with high sensitivity, compared to rapid tests, and can be used to empower clinical precision in COVID-19, a disease in which symptoms are very unspecific.
9

Menon, P. Archana, and Dr. R. Gunasundari. "Study of Interpretability in ML Algorithms for Disease Prognosis." Revista Gestão Inovação e Tecnologias 11, no. 4 (August 19, 2021): 4735–49. http://dx.doi.org/10.47059/revistageintec.v11i4.2500.

Abstract:
Disease prognosis plays an important role in healthcare. Diagnosing disease at an early stage is crucial to provide treatment to the patient at the earliest in order to save his/her life or to at least reduce the severity of the disease. Application of Machine Learning algorithms is a promising area for the early and accurate diagnosis of chronic diseases. The black-box approach of Machine Learning models has been circumvented by providing different Interpretability methods. The importance of interpretability in health care field especially while taking decisions in life threatening diseases is crucial. Interpretable model increases the confidence of a medical practitioner in taking decisions. This paper gives an insight to the importance of explanations as well as the interpretability methods applied to different machine learning and deep learning models developed in recent years.
10

Dawid, Anna, Patrick Huembeli, Michał Tomza, Maciej Lewenstein, and Alexandre Dauphin. "Hessian-based toolbox for reliable and interpretable machine learning in physics." Machine Learning: Science and Technology 3, no. 1 (November 24, 2021): 015002. http://dx.doi.org/10.1088/2632-2153/ac338d.

Abstract:
Machine learning (ML) techniques applied to quantum many-body physics have emerged as a new research field. While the numerical power of this approach is undeniable, the most expressive ML algorithms, such as neural networks, are black boxes: The user does neither know the logic behind the model predictions nor the uncertainty of the model predictions. In this work, we present a toolbox for interpretability and reliability, agnostic of the model architecture. In particular, it provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an extrapolation score for the model predictions. Such a toolbox only requires a single computation of the Hessian of the training loss function. Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
11

Bölte, Jens, Bernadette M. Jansma, Anna Zilverstand, and Pienie Zwitserlood. "Derivational morphology approached with event-related potentials." Mental Lexicon 4, no. 3 (December 15, 2009): 336–53. http://dx.doi.org/10.1075/ml.4.3.02bol.

Abstract:
We investigated the processing of derived adjectives in German using event-related potentials (ERPs). ERPs were registered to existing adjectives (freundlich, ‘friendly’), to morphologically complex pseudowords that were synonymous to an existing adjective and thus interpretable (*freundhaft), and to complex pseudowords that were structurally and semantically anomalous (*freundbar). Stimuli were embedded in sentence contexts, displayed word by word. An ERP effect with a left-frontal maximum was observed around 450–500 ms after stimulus onset. In this window, both pseudoword types differed from existing adjectives. We interpret this data pattern as a LAN, reflecting structural problems due to morphological parsing, a process that is distinct from semantic processing.
12

Rajczakowska, Magdalena, Maciej Szeląg, Karin Habermehl-Cwirzen, Hans Hedlund, and Andrzej Cwirzen. "Interpretable Machine Learning for Prediction of Post-Fire Self-Healing of Concrete." Materials 16, no. 3 (February 2, 2023): 1273. http://dx.doi.org/10.3390/ma16031273.

Abstract:
Developing accurate and interpretable models to forecast concrete’s self-healing behavior is of interest to material engineers, scientists, and civil engineering contractors. Machine learning (ML) and artificial intelligence are powerful tools that allow constructing high-precision predictions, yet often considered “black box” methods due to their complexity. Those approaches are commonly used for the modeling of mechanical properties of concrete with exceptional accuracy; however, there are few studies dealing with the application of ML for the self-healing of cementitious materials. This paper proposes a pioneering study on the utilization of ML for predicting post-fire self-healing of concrete. A large database is constructed based on the literature studies. Twelve input variables are analyzed: w/c, age of concrete, amount of cement, fine aggregate, coarse aggregate, peak loading temperature, duration of peak loading temperature, cooling regime, duration of cooling, curing regime, duration of curing, and specimen volume. The output of the model is the compressive strength recovery, being one of the self-healing efficiency indicators. Four ML methods are optimized and compared based on their performance error: Support Vector Machines (SVM), Regression Trees (RT), Artificial Neural Networks (ANN), and Ensemble of Regression Trees (ET). Monte Carlo analysis is conducted to verify the stability of the selected model. All ML approaches demonstrate satisfying precision, twice as good as linear regression. The ET model is found to be the most optimal with the highest prediction accuracy and sufficient robustness. Model interpretation is performed using Partial Dependence Plots and Individual Conditional Expectation Plots. Temperature, curing regime, and amounts of aggregates are identified as the most significant predictors.
13

Bohanec, Marko, Marko Robnik-Šikonja, and Mirjana Kljajić Borštnar. "Decision-making framework with double-loop learning through interpretable black-box machine learning models." Industrial Management & Data Systems 117, no. 7 (August 14, 2017): 1389–406. http://dx.doi.org/10.1108/imds-09-2016-0409.

Abstract:
Purpose The purpose of this paper is to address the problem of weak acceptance of machine learning (ML) models in business. The proposed framework of top-performing ML models coupled with general explanation methods provides additional information to the decision-making process. This builds a foundation for sustainable organizational learning. Design/methodology/approach To address user acceptance, participatory approach of action design research (ADR) was chosen. The proposed framework is demonstrated on a B2B sales forecasting process in an organizational setting, following cross-industry standard process for data mining (CRISP-DM) methodology. Findings The provided ML model explanations efficiently support business decision makers, reduce forecasting error for new sales opportunities, and facilitate discussion about the context of opportunities in the sales team. Research limitations/implications The quality and quantity of available data affect the performance of models and explanations. Practical implications The application in the real-world company demonstrates the utility of the approach and provides evidence that transparent explanations of ML models contribute to individual and organizational learning. Social implications All used methods are available as an open-source software and can improve the acceptance of ML in data-driven decision making. Originality/value The proposed framework incorporates existing ML models and general explanation methodology into a decision-making process. To the authors’ knowledge, this is the first attempt to support organizational learning with a framework combining ML explanations, ADR, and data mining methodology based on the CRISP-DM industry standard.
14

Kim, Eui-Jin. "Analysis of Travel Mode Choice in Seoul Using an Interpretable Machine Learning Approach." Journal of Advanced Transportation 2021 (March 1, 2021): 1–13. http://dx.doi.org/10.1155/2021/6685004.

Abstract:
Understanding choice behavior regarding travel mode is essential in forecasting travel demand. Machine learning (ML) approaches have been proposed to model mode choice behavior, and their usefulness for predicting performance has been reported. However, due to the black-box nature of ML, it is difficult to determine a suitable explanation for the relationship between the input and output variables. This paper proposes an interpretable ML approach to improve the interpretability (i.e., the degree of understanding the cause of decisions) of ML concerning travel mode choice modeling. This approach applied to national household travel survey data in Seoul. First, extreme gradient boosting (XGB) was applied to travel mode choice modeling, and the XGB outperformed the other ML models. Variable importance, variable interaction, and accumulated local effects (ALE) were measured to interpret the prediction of the best-performing XGB. The results of variable importance and interaction indicated that the correlated trip- and tour-related variables significantly influence predicting travel mode choice by the main and cross effects between them. Age and number of trips on tour were also shown to be an important variable in choosing travel mode. ALE measured the main effect of variables that have a nonlinear relation to choice probability, which cannot be observed in the conventional multinomial logit model. This information can provide interesting behavioral insights on urban mobility.
15

Zafar, Muhammad Rehman, and Naimul Khan. "Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability." Machine Learning and Knowledge Extraction 3, no. 3 (June 30, 2021): 525–41. http://dx.doi.org/10.3390/make3030027.

Abstract:
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique used to increase the interpretability and explainability of black box Machine Learning (ML) algorithms. LIME typically creates an explanation for a single prediction by any ML model by learning a simpler interpretable model (e.g., linear classifier) around the prediction through generating simulated data around the instance by random perturbation, and obtaining feature importance through applying some form of feature selection. While LIME and similar local algorithms have gained popularity due to their simplicity, the random perturbation methods result in shifts in data and instability in the generated explanations, where for the same prediction, different explanations can be generated. These are critical issues that can prevent deployment of LIME in sensitive domains. We propose a deterministic version of LIME. Instead of random perturbation, we utilize Agglomerative Hierarchical Clustering (AHC) to group the training data together and K-Nearest Neighbour (KNN) to select the relevant cluster of the new instance that is being explained. After finding the relevant cluster, a simple model (i.e., linear model or decision tree) is trained over the selected cluster to generate the explanations. Experimental results on six public (three binary and three multi-class) and six synthetic datasets show the superiority for Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where we quantitatively determine the stability and faithfulness of DLIME compared to LIME.
16

Ntampaka, Michelle, and Alexey Vikhlinin. "The Importance of Being Interpretable: Toward an Understandable Machine Learning Encoder for Galaxy Cluster Cosmology." Astrophysical Journal 926, no. 1 (February 1, 2022): 45. http://dx.doi.org/10.3847/1538-4357/ac423e.

Abstract:
We present a deep machine-learning (ML) approach to constraining cosmological parameters with multiwavelength observations of galaxy clusters. The ML approach has two components: an encoder that builds a compressed representation of each galaxy cluster and a flexible convolutional neural networks to estimate the cosmological model from a cluster sample. It is trained and tested on simulated cluster catalogs built from the Magneticum simulations. From the simulated catalogs, the ML method estimates the amplitude of matter fluctuations, σ8, at approximately the expected theoretical limit. More importantly, the deep ML approach can be interpreted. We lay out three schemes for interpreting the ML technique: a leave-one-out method for assessing cluster importance, an average saliency for evaluating feature importance, and correlations in the terse layer for understanding whether an ML technique can be safely applied to observational data. These interpretation schemes led to the discovery of a previously unknown self-calibration mode for flux- and volume-limited cluster surveys. We describe this new mode, which uses the amplitude and peak of the cluster mass probability density function as anchors for mass calibration. We introduce the term overspecialized to describe a common pitfall in astronomical applications of ML in which the ML method learns simulation-specific details, and we show how a carefully constructed architecture can be used to check for this source of systematic error.
17

Liu, Fang, Xiaodi Wang, Ting Li, Mingzeng Huang, Tao Hu, Yunfeng Wen, and Yunche Su. "An Automated and Interpretable Machine Learning Scheme for Power System Transient Stability Assessment." Energies 16, no. 4 (February 16, 2023): 1956. http://dx.doi.org/10.3390/en16041956.

Abstract:
Many repeated manual feature adjustments and much heuristic parameter tuning are required during the debugging of machine learning (ML)-based transient stability assessment (TSA) of power systems. Furthermore, the results produced by ML-based TSA are often not explainable. This paper handles both the automation and interpretability issues of ML-based TSA. An automated machine learning (AutoML) scheme is proposed which consists of auto-feature selection, CatBoost, Bayesian optimization, and performance evaluation. CatBoost, as a new ensemble ML method, is implemented to achieve fast, scalable, and high performance for online TSA. To enable faster deployment and reduce the heavy dependence on human expertise, auto-feature selection and Bayesian optimization, respectively, are introduced to automatically determine the best input features and optimal hyperparameters. Furthermore, to help operators understand the prediction of stable/unstable TSA, an interpretability analysis based on the Shapley additive explanation (SHAP), is embedded into both offline and online phases of the AutoML framework. Test results on IEEE 39-bus system, IEEE 118-bus system, and a practical large-scale power system, demonstrate that the proposed approach achieves more accurate and certain appropriate trust solutions while saving a substantial amount of time in comparison to other methods.
18

Barbosa, Poliana Goncalves, and Elena Nicoladis. "Deverbal compound comprehension in preschool children." Mental Lexicon 11, no. 1 (June 7, 2016): 94–114. http://dx.doi.org/10.1075/ml.11.1.05bar.

Abstract:
When English-speaking children first attempt to produce deverbal compound words (like muffin maker), they often misorder the noun and the verb (e.g., make-muffin, maker muffin, or making-muffin). The purpose of the present studies was to test Usage-based and Distributional Morphology-based explanations of children’s errors. In Study 1, we compared three to four-year old children’s interpretations of Verb-Noun (e.g., push-ball) to Verb-erNoun (e.g., pusher-ball). In Study 2, we compared three- to five-year old children’s interpretations of Verb-erNoun (e.g., pusher-ball) to Noun-Verb-er (e.g., ball pusher). Results from both studies suggest that while preschool children’s understanding of deverbal compounds is still developing, they already show some sensitivity to word ordering within compounds. We argue that these results are interpretable within Usage-based approaches.
19

Hicks, Steven, Debesh Jha, Vajira Thambawita, Pål Halvorsen, Bjørn-Jostein Singstad, Sachin Gaur, Klas Pettersen, et al. "MedAI: Transparency in Medical Image Segmentation." Nordic Machine Intelligence 1, no. 1 (November 1, 2021): 1–4. http://dx.doi.org/10.5617/nmi.9140.

Abstract:
MedAI: Transparency in Medical Image Segmentation is a challenge held for the first time at the Nordic AI Meet that focuses on medical image segmentation and transparency in machine learning (ML)-based systems. We propose three tasks to meet specific gastrointestinal image segmentation challenges collected from experts within the field, including two separate segmentation scenarios and one scenario on transparent ML systems. The latter emphasizes the need for explainable and interpretable ML algorithms. We provide a development dataset for the participants to train their ML models, tested on a concealed test dataset.
20

Lee, Dongwoo, John Mulrow, Chana Joanne Haboucha, Sybil Derrible, and Yoram Shiftan. "Attitudes on Autonomous Vehicle Adoption using Interpretable Gradient Boosting Machine." Transportation Research Record: Journal of the Transportation Research Board 2673, no. 11 (June 23, 2019): 865–78. http://dx.doi.org/10.1177/0361198119857953.

Abstract:
This article applies machine learning (ML) to develop a choice model on three choice alternatives related to autonomous vehicles (AV): regular vehicle (REG), private AV (PAV), and shared AV (SAV). The learned model is used to examine users’ preferences and behaviors on AV uptake by car commuters. Specifically, this study applies gradient boosting machine (GBM) to stated preference (SP) survey data (i.e., panel data). GBM notably possesses more interpretable features than other ML methods as well as high predictive performance for panel data. The prediction performance of GBM is evaluated by conducting a 5-fold cross-validation and shows around 80% accuracy. To interpret users’ behaviors, variable importance (VI) and partial dependence (PD) were measured. The results of VI indicate that trip cost, purchase cost, and subscription cost are the most influential variables in selecting an alternative. Moreover, the attitudinal variables Pro-AV Sentiment and Environmental Concern are also shown to be significant. The article also examines the sensitivity of choice by using the PD of the log-odds on selected important factors. The results inform both the modeling of transportation technology uptake and the configuration and interpretation of GBM that can be applied for policy analysis.
21

Shrotri, Aditya A., Nina Narodytska, Alexey Ignatiev, Kuldeep S. Meel, Joao Marques-Silva, and Moshe Y. Vardi. "Constraint-Driven Explanations for Black-Box ML Models." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8304–14. http://dx.doi.org/10.1609/aaai.v36i8.20805.

Abstract:
The need to understand the inner workings of opaque Machine Learning models has prompted researchers to devise various types of post-hoc explanations. A large class of such explainers proceed in two phases: first perturb an input instance whose explanation is sought, and then generate an interpretable artifact to explain the prediction of the opaque model on that instance. Recently, Deutch and Frost proposed to use an additional input from the user: a set of constraints over the input space to guide the perturbation phase. While this approach affords the user the ability to tailor the explanation to their needs, striking a balance between flexibility, theoretical rigor and computational cost has remained an open challenge. We propose a novel constraint-driven explanation generation approach which simultaneously addresses these issues in a modular fashion. Our framework supports the use of expressive Boolean constraints giving the user more flexibility to specify the subspace to generate perturbations from. Leveraging advances in Formal Methods, we can theoretically guarantee strict adherence of the samples to the desired distribution. This also allows us to compute fidelity in a rigorous way, while scaling much better in practice. Our empirical study demonstrates concrete uses of our tool CLIME in obtaining more meaningful explanations with high fidelity.
22

Luo, Yi, Huan-Hsin Tseng, Sunan Cui, Lise Wei, Randall K. Ten Haken, and Issam El Naqa. "Balancing accuracy and interpretability of machine learning approaches for radiation treatment outcomes modeling." BJR|Open 1, no. 1 (July 2019): 20190021. http://dx.doi.org/10.1259/bjro.20190021.

Abstract:
Radiation outcomes prediction (ROP) plays an important role in personalized prescription and adaptive radiotherapy. A clinical decision may not only depend on an accurate radiation outcomes’ prediction, but also needs to be made based on an informed understanding of the relationship among patients’ characteristics, radiation response and treatment plans. As more patients’ biophysical information become available, machine learning (ML) techniques will have a great potential for improving ROP. Creating explainable ML methods is an ultimate task for clinical practice but remains a challenging one. Towards complete explainability, the interpretability of ML approaches needs to be first explored. Hence, this review focuses on the application of ML techniques for clinical adoption in radiation oncology by balancing accuracy with interpretability of the predictive model of interest. An ML algorithm can be generally classified into an interpretable (IP) or non-interpretable (NIP) (“black box”) technique. While the former may provide a clearer explanation to aid clinical decision-making, its prediction performance is generally outperformed by the latter. Therefore, great efforts and resources have been dedicated towards balancing the accuracy and the interpretability of ML approaches in ROP, but more still needs to be done. In this review, current progress to increase the accuracy for IP ML approaches is introduced, and major trends to improve the interpretability and alleviate the “black box” stigma of ML in radiation outcomes modeling are summarized. Efforts to integrate IP and NIP ML approaches to produce predictive models with higher accuracy and interpretability for ROP are also discussed.
23

Schnur, Christopher, Payman Goodarzi, Yevgeniya Lugovtsova, Jannis Bulling, Jens Prager, Kilian Tschöke, Jochen Moll, Andreas Schütze, and Tizian Schneider. "Towards Interpretable Machine Learning for Automated Damage Detection Based on Ultrasonic Guided Waves." Sensors 22, no. 1 (January 5, 2022): 406. http://dx.doi.org/10.3390/s22010406.

Abstract:
Data-driven analysis for damage assessment has a large potential in structural health monitoring (SHM) systems, where sensors are permanently attached to the structure, enabling continuous and frequent measurements. In this contribution, we propose a machine learning (ML) approach for automated damage detection, based on an ML toolbox for industrial condition monitoring. The toolbox combines multiple complementary algorithms for feature extraction and selection and automatically chooses the best combination of methods for the dataset at hand. Here, this toolbox is applied to a guided wave-based SHM dataset for varying temperatures and damage locations, which is freely available on the Open Guided Waves platform. A classification rate of 96.2% is achieved, demonstrating reliable and automated damage detection. Moreover, the ability of the ML model to identify a damaged structure at untrained damage locations and temperatures is demonstrated.
24

Eder, Matthias, Emanuel Moser, Andreas Holzinger, Claire Jean-Quartier, and Fleur Jeanquartier. "Interpretable Machine Learning with Brain Image and Survival Data." BioMedInformatics 2, no. 3 (September 6, 2022): 492–510. http://dx.doi.org/10.3390/biomedinformatics2030031.

Abstract:
Recent developments in research on artificial intelligence (AI) in medicine deal with the analysis of image data such as Magnetic Resonance Imaging (MRI) scans to support the of decision-making of medical personnel. For this purpose, machine learning (ML) algorithms are often used, which do not explain the internal decision-making process at all. Thus, it is often difficult to validate or interpret the results of the applied AI methods. This manuscript aims to overcome this problem by using methods of explainable AI (XAI) to interpret the decision-making of an ML algorithm in the use case of predicting the survival rate of patients with brain tumors based on MRI scans. Therefore, we explore the analysis of brain images together with survival data to predict survival in gliomas with a focus on improving the interpretability of the results. Using the Brain Tumor Segmentation dataset BraTS 2020, we used a well-validated dataset for evaluation and relied on a convolutional neural network structure to improve the explainability of important features by adding Shapley overlays. The trained network models were used to evaluate SHapley Additive exPlanations (SHAP) directly and were not optimized for accuracy. The resulting overfitting of some network structures is therefore seen as a use case of the presented interpretation method. It is shown that the network structure can be validated by experts using visualizations, thus making the decision-making of the method interpretable. Our study highlights the feasibility of combining explainers with 3D voxels and also the fact that the interpretation of prediction results significantly supports the evaluation of results. The implementation in python is available on gitlab as “XAIforBrainImgSurv”.
25

Gadzinski, Gregory, and Alessio Castello. "Combining white box models, black box machines and human interventions for interpretable decision strategies." Judgment and Decision Making 17, no. 3 (May 2022): 598–627. http://dx.doi.org/10.1017/s1930297500003594.

Abstract:
Granting a short-term loan is a critical decision. A great deal of research has concerned the prediction of credit default, notably through Machine Learning (ML) algorithms. However, given that their black-box nature has sometimes led to unwanted outcomes, comprehensibility in ML guided decision-making strategies has become more important. In many domains, transparency and accountability are no longer optional. In this article, instead of opposing white-box against black-box models, we use a multi-step procedure that combines the Fast and Frugal Tree (FFT) methodology of Martignon et al. (2005) and Phillips et al. (2017) with the extraction of post-hoc explainable information from ensemble ML models. New interpretable models are then built thanks to the inclusion of explainable ML outputs chosen by human intervention. Our methodology improves significantly the accuracy of the FFT predictions while preserving their explainable nature. We apply our approach to a dataset of short-term loans granted to borrowers in the UK, and show how complex machine learning can challenge simpler machines and help decision makers.
26

Cakiroglu, Celal, Kamrul Islam, Gebrail Bekdaş, Sanghun Kim, and Zong Woo Geem. "Interpretable Machine Learning Algorithms to Predict the Axial Capacity of FRP-Reinforced Concrete Columns." Materials 15, no. 8 (April 8, 2022): 2742. http://dx.doi.org/10.3390/ma15082742.

Abstract:
Fiber-reinforced polymer (FRP) rebars are increasingly being used as an alternative to steel rebars in reinforced concrete (RC) members due to their excellent corrosion resistance capability and enhanced mechanical properties. Extensive research works have been performed in the last two decades to develop predictive models, codes, and guidelines to estimate the axial load-carrying capacity of FRP-RC columns. This study utilizes the power of artificial intelligence and develops an alternative approach to predict the axial capacity of FRP-RC columns more accurately using data-driven machine learning (ML) algorithms. A database of 117 tests of axially loaded FRP-RC columns is collected from the literature. The geometric and material properties, column shape and slenderness ratio, reinforcement details, and FRP types are used as the input variables, while the load-carrying capacity is used as the output response to develop the ML models. Furthermore, the input-output relationship of the ML model is explained through feature importance analysis and the SHapely Additive exPlanations (SHAP) approach. Eight ML models, namely, Kernel Ridge Regression, Lasso Regression, Support Vector Machine, Gradient Boosting Machine, Adaptive Boosting, Random Forest, Categorical Gradient Boosting, and Extreme Gradient Boosting, are used in this study for capacity prediction, and their relative performances are compared to identify the best-performing ML model. Finally, predictive equations are proposed using the harmony search optimization and the model interpretations obtained through the SHAP algorithm.
27

Park, Jurn-Gyu, Nikil Dutt, and Sung-Soo Lim. "An Interpretable Machine Learning Model Enhanced Integrated CPU-GPU DVFS Governor." ACM Transactions on Embedded Computing Systems 20, no. 6 (November 30, 2021): 1–28. http://dx.doi.org/10.1145/3470974.

Abstract:
Modern heterogeneous CPU-GPU-based mobile architectures, which execute intensive mobile gaming/graphics applications, use software governors to achieve high performance with energy-efficiency. However, existing governors typically utilize simple statistical or heuristic models, assuming linear relationships using a small unbalanced dataset of mobile games; and the limitations result in high prediction errors for dynamic and diverse gaming workloads on heterogeneous platforms. To overcome these limitations, we propose an interpretable machine learning (ML) model enhanced integrated CPU-GPU governor: (1) It builds tree-based piecewise linear models (i.e., model trees) offline considering both high accuracy (low error) and interpretable ML models based on mathematical formulas using a simulatability operation counts quantitative metric. And then (2) it deploys the selected models for online estimation into an integrated CPU-GPU Dynamic Voltage Frequency Scaling governor. Our experiments on a test set of 20 mobile games exhibiting diverse characteristics show that our governor achieved significant energy efficiency gains of over 10% (up to 38%) improvements on average in energy-per-frame with a surprising-but-modest 3% improvement in Frames-per-Second performance, compared to a typical state-of-the-art governor that employs simple linear regression models.
28

Li, Fa, Qing Zhu, William J. Riley, Lei Zhao, Li Xu, Kunxiaojia Yuan, Min Chen, et al. "AttentionFire_v1.0: interpretable machine learning fire model for burned-area predictions over tropics." Geoscientific Model Development 16, no. 3 (February 3, 2023): 869–84. http://dx.doi.org/10.5194/gmd-16-869-2023.

Abstract:
African and South American (ASA) wildfires account for more than 70 % of global burned areas and have strong connection to local climate for sub-seasonal to seasonal wildfire dynamics. However, representation of the wildfire–climate relationship remains challenging due to spatiotemporally heterogenous responses of wildfires to climate variability and human influences. Here, we developed an interpretable machine learning (ML) fire model (AttentionFire_v1.0) to resolve the complex controls of climate and human activities on burned areas and to better predict burned areas over ASA regions. Our ML fire model substantially improved predictability of burned areas for both spatial and temporal dynamics compared with five commonly used machine learning models. More importantly, the model revealed strong time-lagged control from climate wetness on the burned areas. The model also predicted that, under a high-emission future climate scenario, the recently observed declines in burned area will reverse in South America in the near future due to climate changes. Our study provides a reliable and interpretable fire model and highlights the importance of lagged wildfire–climate relationships in historical and future predictions.
29

Chaibi, Mohamed, EL Mahjoub Benghoulam, Lhoussaine Tarik, Mohamed Berrada, and Abdellah El Hmaidi. "An Interpretable Machine Learning Model for Daily Global Solar Radiation Prediction." Energies 14, no. 21 (November 5, 2021): 7367. http://dx.doi.org/10.3390/en14217367.

Abstract:
Machine learning (ML) models are commonly used in solar modeling due to their high predictive accuracy. However, the predictions of these models are difficult to explain and trust. This paper aims to demonstrate the utility of two interpretation techniques to explain and improve the predictions of ML models. We compared first the predictive performance of Light Gradient Boosting (LightGBM) with three benchmark models, including multilayer perceptron (MLP), multiple linear regression (MLR), and support-vector regression (SVR), for estimating the global solar radiation (H) in the city of Fez, Morocco. Then, the predictions of the most accurate model were explained by two model-agnostic explanation techniques: permutation feature importance (PFI) and Shapley additive explanations (SHAP). The results indicated that LightGBM (R2 = 0.9377, RMSE = 0.4827 kWh/m2, MAE = 0.3614 kWh/m2) provides similar predictive accuracy as SVR, and outperformed MLP and MLR in the testing stage. Both PFI and SHAP methods showed that extraterrestrial solar radiation (H0) and sunshine duration fraction (SF) are the two most important parameters that affect H estimation. Moreover, the SHAP method established how each feature influences the LightGBM estimations. The predictive accuracy of the LightGBM model was further improved slightly after re-examination of features, where the model combining H0, SF, and RH was better than the model with all features.
30

Aslam, Nida, Irfan Ullah Khan, Samiha Mirza, Alanoud AlOwayed, Fatima M. Anis, Reef M. Aljuaid, and Reham Baageel. "Interpretable Machine Learning Models for Malicious Domains Detection Using Explainable Artificial Intelligence (XAI)." Sustainability 14, no. 12 (June 16, 2022): 7375. http://dx.doi.org/10.3390/su14127375.

Abstract:
With the expansion of the internet, a major threat has emerged involving the spread of malicious domains intended by attackers to perform illegal activities aiming to target governments, violating privacy of organizations, and even manipulating everyday users. Therefore, detecting these harmful domains is necessary to combat the growing network attacks. Machine Learning (ML) models have shown significant outcomes towards the detection of malicious domains. However, the “black box” nature of the complex ML models obstructs their wide-ranging acceptance in some of the fields. The emergence of Explainable Artificial Intelligence (XAI) has successfully incorporated the interpretability and explicability in the complex models. Furthermore, the post hoc XAI model has enabled the interpretability without affecting the performance of the models. This study aimed to propose an Explainable Artificial Intelligence (XAI) model to detect malicious domains on a recent dataset containing 45,000 samples of malicious and non-malicious domains. In the current study, initially several interpretable ML models, such as Decision Tree (DT) and Naïve Bayes (NB), and black box ensemble models, such as Random Forest (RF), Extreme Gradient Boosting (XGB), AdaBoost (AB), and Cat Boost (CB) algorithms, were implemented and found that XGB outperformed the other classifiers. Furthermore, the post hoc XAI global surrogate model (Shapley additive explanations) and local surrogate LIME were used to generate the explanation of the XGB prediction. Two sets of experiments were performed; initially the model was executed using a preprocessed dataset and later with selected features using the Sequential Forward Feature selection algorithm. The results demonstrate that ML algorithms were able to distinguish benign and malicious domains with overall accuracy ranging from 0.8479 to 0.9856. The ensemble classifier XGB achieved the highest result, with an AUC and accuracy of 0.9991 and 0.9856, respectively, before the feature selection algorithm, while there was an AUC of 0.999 and accuracy of 0.9818 after the feature selection algorithm. The proposed model outperformed the benchmark study.
31

Thekke Kanapram, Divya, Lucio Marcenaro, David Martin Gomez, and Carlo Regazzoni. "Graph-Powered Interpretable Machine Learning Models for Abnormality Detection in Ego-Things Network." Sensors 22, no. 6 (March 15, 2022): 2260. http://dx.doi.org/10.3390/s22062260.

Abstract:
In recent days, it is becoming essential to ensure that the outcomes of signal processing methods based on machine learning (ML) data-driven models can provide interpretable predictions. The interpretability of ML models can be defined as the capability to understand the reasons that contributed to generating a given outcome in a complex autonomous or semi-autonomous system. The necessity of interpretability is often related to the evaluation of performances in complex systems and the acceptance of agents’ automatization processes where critical high-risk decisions have to be taken. This paper concentrates on one of the core functionality of such systems, i.e., abnormality detection, and on choosing a model representation modality based on a data-driven machine learning (ML) technique such that the outcomes become interpretable. The interpretability in this work is achieved through graph matching of semantic level vocabulary generated from the data and their relationships. The proposed approach assumes that the data-driven models to be chosen should support emergent self-awareness (SA) of the agents at multiple abstraction levels. It is demonstrated that the capability of incrementally updating learned representation models based on progressive experiences of the agent is shown to be strictly related to interpretability capability. As a case study, abnormality detection is analyzed as a primary feature of the collective awareness (CA) of a network of vehicles performing cooperative behaviors. Each vehicle is considered an example of an Internet of Things (IoT) node, therefore providing results that can be generalized to an IoT framework where agents have different sensors, actuators, and tasks to be accomplished. The capability of a model to allow evaluation of abnormalities at different levels of abstraction in the learned models is addressed as a key aspect for interpretability.
32

Hu, Hao, Marie-José Huguet, and Mohamed Siala. "Optimizing Binary Decision Diagrams with MaxSAT for Classification." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 3767–75. http://dx.doi.org/10.1609/aaai.v36i4.20291.

Abstract:
The growing interest in explainable artificial intelligence(XAI) for critical decision making motivates the need for interpretable machine learning (ML) models. In fact, due to their structure (especially with small sizes), these models are inherently understandable by humans. Recently, several exact methods for computing such models are proposed to overcome weaknesses of traditional heuristic methods by providing more compact models or better prediction quality. Despite their compressed representation of Boolean functions, Binary decision diagrams (BDDs) did not gain enough interest as other interpretable ML models. In this paper, we first propose SAT-based models for learning optimal BDDs (in terms of the number of features) that classify all input examples. Then, we lift the encoding to a MaxSAT model to learn optimal BDDs in limited depths, that maximize the number of examples correctly classified. Finally, we tackle the fragmentation problem by introducing a method to merge compatible subtrees for the BDDs found via the MaxSAT model. Our empirical study shows clear benefits of the proposed approach in terms of prediction quality and interpretability (i.e., lighter size) compared to the state-of-the-art approaches.
33

Khanna, Varada Vivek, Krishnaraj Chadaga, Niranajana Sampathila, Srikanth Prabhu, Venkatesh Bhandage, and Govardhan K. Hegde. "A Distinctive Explainable Machine Learning Framework for Detection of Polycystic Ovary Syndrome." Applied System Innovation 6, no. 2 (February 23, 2023): 32. http://dx.doi.org/10.3390/asi6020032.

Abstract:
Polycystic Ovary Syndrome (PCOS) is a complex disorder predominantly defined by biochemical hyperandrogenism, oligomenorrhea, anovulation, and in some cases, the presence of ovarian microcysts. This endocrinopathy inhibits ovarian follicle development causing symptoms like obesity, acne, infertility, and hirsutism. Artificial Intelligence (AI) has revolutionized healthcare, contributing remarkably to science and engineering domains. Therefore, we have demonstrated an AI approach using heterogeneous Machine Learning (ML) and Deep Learning (DL) classifiers to predict PCOS among fertile patients. We used an Open-source dataset of 541 patients from Kerala, India. Among all the classifiers, the final multi-stack of ML models performed best with accuracy, precision, recall, and F1-score of 98%, 97%, 98%, and 98%. Explainable AI (XAI) techniques make model predictions understandable, interpretable, and trustworthy. Hence, we have utilized XAI techniques such as SHAP (SHapley Additive Values), LIME (Local Interpretable Model Explainer), ELI5, Qlattice, and feature importance with Random Forest for explaining tree-based classifiers. The motivation of this study is to accurately detect PCOS in patients while simultaneously proposing an automated screening architecture with explainable machine learning tools to assist medical professionals in decision-making.
34

Combs, Kara, Mary Fendley, and Trevor Bihl. "A Preliminary Look at Heuristic Analysis for Assessing Artificial Intelligence Explainability." WSEAS TRANSACTIONS ON COMPUTER RESEARCH 8 (June 1, 2020): 61–72. http://dx.doi.org/10.37394/232018.2020.8.9.

Abstract:
Artificial Intelligence and Machine Learning (AI/ML) models are increasingly criticized for their “black-box” nature. Therefore, eXplainable AI (XAI) approaches to extract human-interpretable decision processes from algorithms have been explored. However, XAI research lacks understanding of algorithmic explainability from a human factors’ perspective. This paper presents a repeatable human factors heuristic analysis for XAI with a demonstration on four decision tree classifier algorithms.
35

Lakkad, Aditya Kamleshbhai, Rushit Dharmendrabhai Bhadaniya, Vraj Nareshkumar Shah et Lavanya K. « Complex Events Processing on Live News Events Using Apache Kafka and Clustering Techniques ». International Journal of Intelligent Information Technologies 17, no 1 (janvier 2021) : 39–52. http://dx.doi.org/10.4018/ijiit.2021010103.

Texte intégral
Résumé :
The explosive growth of news and news content generated worldwide, coupled with the expansion of online media and rapid access to data, has made the tracking and screening of news tedious. There is an expanding need for a model that can preprocess, break down, and organize the main content to extract interpretable information, explicitly recognizing topics and content-driven groupings of articles. This paper proposes automated analysis of heterogeneous news through complex event processing (CEP) and machine learning (ML) algorithms. News content is streamed using Apache Kafka, stored in Apache Druid, and further processed by a blend of natural language processing (NLP) and unsupervised machine learning (ML) techniques.
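A minimal sketch of this kind of stream-then-cluster pipeline is shown below; it is not the authors' system. The topic and broker names are hypothetical, a running Kafka broker is assumed, and TF-IDF with k-means stands in for the unsupervised NLP stage (the Druid storage step is omitted):

```python
from kafka import KafkaConsumer  # kafka-python client
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical topic and broker; assumes a Kafka broker is reachable at localhost:9092.
consumer = KafkaConsumer("news-articles",
                         bootstrap_servers="localhost:9092",
                         value_deserializer=lambda m: m.decode("utf-8"),
                         consumer_timeout_ms=5000)
articles = [msg.value for msg in consumer]  # drain whatever is currently on the topic

if articles:
    tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
    X = tfidf.fit_transform(articles)
    labels = KMeans(n_clusters=8, random_state=0).fit_predict(X)  # content-driven groupings
    print(labels[:20])
```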
Styles APA, Harvard, Vancouver, ISO, etc.
36

Navidi, Zeinab, Jesse Sun, Raymond H. Chan, Kate Hanneman, Amna Al-Arnawoot, Alif Munim, Harry Rakowski et al. « Interpretable machine learning for automated left ventricular scar quantification in hypertrophic cardiomyopathy patients ». PLOS Digital Health 2, no 1 (4 janvier 2023) : e0000159. http://dx.doi.org/10.1371/journal.pdig.0000159.

Texte intégral
Résumé :
Scar quantification on cardiovascular magnetic resonance (CMR) late gadolinium enhancement (LGE) images is important in risk stratifying patients with hypertrophic cardiomyopathy (HCM) due to the importance of scar burden in predicting clinical outcomes. We aimed to develop a machine learning (ML) model that contours left ventricular (LV) endo- and epicardial borders and quantifies CMR LGE images from HCM patients. We retrospectively studied 2557 unprocessed images from 307 HCM patients followed at the University Health Network (Canada) and Tufts Medical Center (USA). LGE images were manually segmented by two experts using two different software packages. Using a 6SD LGE intensity cutoff as the gold standard, a 2-dimensional convolutional neural network (CNN) was trained on 80% and tested on the remaining 20% of the data. Model performance was evaluated using the Dice Similarity Coefficient (DSC), Bland-Altman analysis, and Pearson’s correlation. The 6SD model DSC scores were good to excellent at 0.91 ± 0.04, 0.83 ± 0.03, and 0.64 ± 0.09 for the LV endocardium, epicardium, and scar segmentation, respectively. The bias and limits of agreement for the percentage of LGE to LV mass were low (-0.53 ± 2.71%), and the correlation was high (r = 0.92). This fully automated interpretable ML algorithm allows rapid and accurate scar quantification from CMR LGE images. The program does not require manual image pre-processing and was trained with multiple experts and software, increasing its generalizability.
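The evaluation metric named here, the Dice Similarity Coefficient, is simple to compute; the sketch below uses small illustrative binary masks rather than CMR segmentations:

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-8):
    """Dice Similarity Coefficient between two binary segmentation masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return 2.0 * intersection / (pred.sum() + true.sum() + eps)

# Illustrative 2D masks (e.g., predicted vs. manual scar segmentation).
a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True
b = np.zeros((4, 4), dtype=bool); b[1:3, 1:4] = True
print(round(dice_coefficient(a, b), 3))  # 2*4 / (4 + 6) = 0.8
```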
Styles APA, Harvard, Vancouver, ISO, etc.
37

Izza, Yacine, Alexey Ignatiev et Joao Marques-Silva. « On Tackling Explanation Redundancy in Decision Trees ». Journal of Artificial Intelligence Research 75 (29 septembre 2022) : 261–321. http://dx.doi.org/10.1613/jair.1.13575.

Texte intégral
Résumé :
Decision trees (DTs) epitomize the ideal of interpretability of machine learning (ML) models. The interpretability of decision trees motivates explainability approaches by so-called intrinsic interpretability, and it is at the core of recent proposals for applying interpretable ML models in high-risk applications. The belief in DT interpretability is justified by the fact that explanations for DT predictions are generally expected to be succinct. Indeed, in the case of DTs, explanations correspond to DT paths. Since decision trees are ideally shallow, and so paths contain far fewer features than the total number of features, explanations in DTs are expected to be succinct, and hence interpretable. This paper offers both theoretical and experimental arguments demonstrating that, as long as interpretability of decision trees equates with succinctness of explanations, then decision trees ought not be deemed interpretable. The paper introduces logically rigorous path explanations and path explanation redundancy, and proves that there exist functions for which decision trees must exhibit paths with explanation redundancy that is arbitrarily larger than the actual path explanation. The paper also proves that only a very restricted class of functions can be represented with DTs that exhibit no explanation redundancy. In addition, the paper includes experimental results substantiating that path explanation redundancy is observed ubiquitously in decision trees, including those obtained using different tree learning algorithms, but also in a wide range of publicly available decision trees. The paper also proposes polynomial-time algorithms for eliminating path explanation redundancy, which in practice require negligible time to compute. Thus, these algorithms serve to indirectly attain irreducible, and so succinct, explanations for decision trees. Furthermore, the paper includes novel results related with duality and enumeration of explanations, based on using SAT solvers as witness-producing NP-oracles.
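The notion of path-explanation redundancy can be made concrete over a small Boolean domain. The sketch below (an illustration of the concept, not the paper's SAT-based algorithms) takes the literals along one root-to-leaf path and checks, by exhaustive enumeration, whether any literal can be dropped while the prediction stays fixed for every completion of the remaining features:

```python
from itertools import product

def is_redundant(literal, path, predict, n_features, target):
    """A literal in a path explanation is redundant if fixing the remaining
    literals already forces the prediction, over all completions of the
    free features (brute force over a small Boolean domain)."""
    fixed = {f: v for f, v in path.items() if f != literal}
    free = [f for f in range(n_features) if f not in fixed]
    for values in product([0, 1], repeat=len(free)):
        point = dict(fixed)
        point.update(zip(free, values))
        x = [point[f] for f in range(n_features)]
        if predict(x) != target:
            return False
    return True

# A tree computing f = x0 AND x1 whose root needlessly tests the irrelevant x2 first.
predict = lambda x: int(x[0] and x[1])
path = {2: 1, 0: 1, 1: 1}  # literals along one root-to-leaf path
print([f for f in path if is_redundant(f, path, predict, 3, target=1)])  # -> [2]
```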
Styles APA, Harvard, Vancouver, ISO, etc.
38

McGovern, Amy, Ryan Lagerquist, David John Gagne, G. Eli Jergensen, Kimberly L. Elmore, Cameron R. Homeyer et Travis Smith. « Making the Black Box More Transparent : Understanding the Physical Implications of Machine Learning ». Bulletin of the American Meteorological Society 100, no 11 (novembre 2019) : 2175–99. http://dx.doi.org/10.1175/bams-d-18-0195.1.

Texte intégral
Résumé :
This paper synthesizes multiple methods for machine learning (ML) model interpretation and visualization (MIV), focusing on meteorological applications. ML has recently exploded in popularity in many fields, including meteorology. Although ML has been successful in meteorology, it has not been as widely accepted, primarily due to the perception that ML models are “black boxes,” meaning the ML methods are thought to take inputs and provide outputs but not to yield physically interpretable information to the user. This paper introduces and demonstrates multiple MIV techniques for both traditional ML and deep learning, to enable meteorologists to understand what ML models have learned. We discuss permutation-based predictor importance, forward and backward selection, saliency maps, class-activation maps, backward optimization, and novelty detection. We apply these methods at multiple spatiotemporal scales to tornado, hail, winter precipitation type, and convective-storm mode. By analyzing such a wide variety of applications, we intend for this work to demystify the black box of ML, offer insight in applying MIV techniques, and serve as an MIV toolbox for meteorologists and other physical scientists.
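Permutation-based predictor importance, the first technique listed, is available off the shelf in scikit-learn. The sketch below runs it on synthetic data standing in for meteorological predictors; it is a generic illustration rather than the paper's implementation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for storm-report predictors.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```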
Styles APA, Harvard, Vancouver, ISO, etc.
39

Guo, Ganggui, Shanshan Li, Yakun Liu, Ze Cao et Yangyu Deng. « Prediction of Cavity Length Using an Interpretable Ensemble Learning Approach ». International Journal of Environmental Research and Public Health 20, no 1 (30 décembre 2022) : 702. http://dx.doi.org/10.3390/ijerph20010702.

Texte intégral
Résumé :
The cavity length, which is a vital index in aeration and corrosion reduction engineering, is affected by many factors and is challenging to calculate. In this study, 10-fold cross-validation was performed to select the optimal input configuration. Additionally, the hyperparameters of three ensemble learning models—random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting tree (XGBOOST)—were fine-tuned by the Bayesian optimization (BO) algorithm to improve prediction accuracy, and the results were compared with five empirical methods. The XGBOOST method presented the highest prediction accuracy. Further interpretability analysis carried out using the Sobol method demonstrated its ability to reasonably capture the varying relative significance of different input features under different flow conditions. The Sobol sensitivity analysis also revealed two patterns of extracting information from the input features in ML models: (1) the main effect of individual features in ensemble learning and (2) the interactive effect between features in support vector regression (SVR). The models relying on individual (main-effect) information predicted the cavity length more accurately than the one relying on interactive information. XGBOOST also captured more relevant information from the features, so its Sobol indices varied in accordance with the observed physical phenomena, while its predictions fit the experimental points best.
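A minimal sketch of Sobol sensitivity analysis on a fitted gradient-boosting surrogate is shown below. It uses synthetic data and default-ish XGBoost hyperparameters rather than the BO-tuned configuration, assumes the SALib package is installed, and uses illustrative feature names and bounds:

```python
import numpy as np
from xgboost import XGBRegressor
from SALib.sample import saltelli
from SALib.analyze import sobol

# Synthetic stand-in for flow features -> cavity length.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 4))
y = 3 * X[:, 0] + np.sin(6 * X[:, 1]) * X[:, 2] + 0.1 * rng.normal(size=500)
model = XGBRegressor(n_estimators=300, max_depth=4).fit(X, y)

# First-order (S1) and total (ST) Sobol indices of the fitted surrogate.
problem = {"num_vars": 4, "names": ["f0", "f1", "f2", "f3"], "bounds": [[0, 1]] * 4}
samples = saltelli.sample(problem, 1024)          # Saltelli sampling of the input space
indices = sobol.analyze(problem, model.predict(samples))
print(indices["S1"], indices["ST"])
```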
Styles APA, Harvard, Vancouver, ISO, etc.
40

Jaafreh, Russlan, Jung-Gu Kim et Kotiba Hamad. « Interpretable Machine Learning Analysis of Stress Concentration in Magnesium : An Insight beyond the Black Box of Predictive Modeling ». Crystals 12, no 9 (2 septembre 2022) : 1247. http://dx.doi.org/10.3390/cryst12091247.

Texte intégral
Résumé :
In the present work, machine learning (ML) was employed to build a model, and through it, the microstructural features (parameters) affecting the stress concentration (SC) during plastic deformation of magnesium (Mg)-based materials are determined. As a descriptor for the SC, the kernel average misorientation (KAM) was used, and starting from the microstructural features of pure Mg and AZ31 Mg alloy, as recorded using electron backscattered diffraction (EBSD), the ML model was trained and constructed using various types of ML algorithms, including Logistic Regression (LR), Decision Trees (DT), Random Forest (RF), Naive Bayes Classifier (NBC), K-Nearest Neighbor (KNN), Multilayer Perceptron (MLP), and Extremely Randomized Trees (ERT). The results show that the accuracy of the ERT-based model was higher compared to other models, and accordingly, the nine most-important features in the ERT-based model, those with a Gini impurity higher than 0.025, were extracted. The feature importance showed that the grain size is the most effective microstructural parameter for controlling the SC in Mg-based materials, and according to the relative Accumulated Local Effects (ALE) plot, calculated to show the relationship between KAM and grain size, it was found that SC occurs with a lower probability in the fine range of grain size. All findings from the ML-based model built in the present work were experimentally confirmed through EBSD observations.
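The feature-selection step described here (keeping features whose impurity-based importance in an Extremely Randomized Trees model exceeds 0.025) is straightforward to reproduce in outline. The sketch below uses synthetic data in place of the EBSD-derived microstructural parameters and is not the authors' model:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in for microstructural features vs. a KAM-based target.
X, y = make_classification(n_samples=800, n_features=20, n_informative=6, random_state=1)
ert = ExtraTreesClassifier(n_estimators=500, random_state=1).fit(X, y)

importances = ert.feature_importances_       # Gini (impurity) based importances
selected = np.where(importances > 0.025)[0]  # keep features above the 0.025 cut-off
print(sorted(zip(importances[selected].round(3), selected), reverse=True))
```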
Styles APA, Harvard, Vancouver, ISO, etc.
41

Bertsimas, Dimitris, Daisy Zhuo, Jack Dunn, Jordan Levine, Eugenio Zuccarelli, Nikos Smyrnakis, Zdzislaw Tobota, Bohdan Maruszewski, Jose Fragata et George E. Sarris. « Adverse Outcomes Prediction for Congenital Heart Surgery : A Machine Learning Approach ». World Journal for Pediatric and Congenital Heart Surgery 12, no 4 (28 avril 2021) : 453–60. http://dx.doi.org/10.1177/21501351211007106.

Texte intégral
Résumé :
Objective: Risk assessment tools typically used in congenital heart surgery (CHS) assume that various possible risk factors interact in a linear and additive fashion, an assumption that may not reflect reality. Using artificial intelligence techniques, we sought to develop nonlinear models for predicting outcomes in CHS. Methods: We built machine learning (ML) models to predict mortality, postoperative mechanical ventilatory support time (MVST), and hospital length of stay (LOS) for patients who underwent CHS, based on data of more than 235,000 patients and 295,000 operations provided by the European Congenital Heart Surgeons Association Congenital Database. We used optimal classification trees (OCTs) methodology for its interpretability and accuracy, and compared to logistic regression and state-of-the-art ML methods (Random Forests, Gradient Boosting), reporting their area under the curve (AUC or c-statistic) for both training and testing data sets. Results: Optimal classification trees achieve outstanding performance across all three models (mortality AUC = 0.86, prolonged MVST AUC = 0.85, prolonged LOS AUC = 0.82), while being intuitively interpretable. The most significant predictors of mortality are procedure, age, and weight, followed by days since previous admission and any general preoperative patient risk factors. Conclusions: The nonlinear ML-based models of OCTs are intuitively interpretable and provide superior predictive power. The associated risk calculator allows easy, accurate, and understandable estimation of individual patient risks, in the theoretical framework of the average performance of all centers represented in the database. This methodology has the potential to facilitate decision-making and resource optimization in CHS, enabling total quality management and precise benchmarking initiatives.
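Optimal classification trees are a proprietary method, so the sketch below uses a shallow scikit-learn decision tree only as a rough stand-in to show the AUC comparison workflow against logistic regression, random forests, and gradient boosting; the data are synthetic and imbalanced to mimic a rare adverse outcome:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced stand-in for surgical outcome data (~5% positives).
X, y = make_classification(n_samples=5000, n_features=15, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "shallow tree (rough OCT stand-in)": DecisionTreeClassifier(max_depth=4, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    print(name, round(roc_auc_score(y_te, m.predict_proba(X_te)[:, 1]), 3))
```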
Styles APA, Harvard, Vancouver, ISO, etc.
42

Wöber, Wilfried, Manuel Curto, Papius Tibihika, Paul Meulenbroek, Esayas Alemayehu, Lars Mehnen, Harald Meimberg et Peter Sykacek. « Identifying geographically differentiated features of Ethiopian Nile tilapia (Oreochromis niloticus) morphology with machine learning ». PLOS ONE 16, no 4 (15 avril 2021) : e0249593. http://dx.doi.org/10.1371/journal.pone.0249593.

Texte intégral
Résumé :
Visual characteristics are among the most important features for characterizing the phenotype of biological organisms. Color and geometric properties define population phenotype and allow assessing diversity and adaptation to environmental conditions. To analyze geometric properties, classical morphometrics relies on biologically relevant landmarks which are manually assigned to digital images. Assigning landmarks is tedious and error prone. Predefined landmarks may in addition miss out on information which is not obvious to the human eye. The machine learning (ML) community has recently proposed new data analysis methods which, by uncovering subtle features in images, obtain excellent predictive accuracy. Scientific credibility, however, demands that results be interpretable and hence that the black-box nature of ML methods be mitigated. To this end, we apply complementary methods and investigate internal representations with saliency maps to reliably identify location-specific characteristics in images of Nile tilapia populations. Analyzing fish images which were sampled from six Ethiopian lakes reveals that deep learning improves on a conventional morphometric analysis in predictive performance. A critical assessment of established saliency maps with a novel significance test reveals, however, that the improvement is aided by artifacts which have no biological interpretation. More interpretable results are obtained by a Bayesian approach which allows us to identify genuine Nile tilapia body features which differ depending on the animals' habitat. We find that automatically inferred Nile tilapia body features corroborate and expand the results of a landmark-based analysis: the anterior dorsum, the fish belly, the posterior dorsal region and the caudal fin show signs of adaptation to the fish habitat. We may thus conclude that Nile tilapia show habitat-specific morphotypes and that an ML analysis allows inferring novel biological knowledge in a reproducible manner.
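A vanilla-gradient saliency map, the kind of internal-representation probe mentioned above, can be sketched in a few lines of PyTorch. The toy CNN below is untrained and stands in for the fish-image classifier; this illustrates the mechanism only, not the paper's models or its significance test:

```python
import torch
import torch.nn as nn

# Untrained toy CNN standing in for the fish-image classifier (6 hypothetical lake classes).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 6))
model.eval()

image = torch.rand(1, 3, 128, 128, requires_grad=True)  # one RGB image
score = model(image)[0].max()                            # score of the top class
score.backward()                                          # gradients w.r.t. the pixels
saliency = image.grad.abs().max(dim=1)[0]                 # per-pixel importance map
print(saliency.shape)                                     # torch.Size([1, 128, 128])
```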
Styles APA, Harvard, Vancouver, ISO, etc.
43

Alsayegh, Faisal, Moh A. Alkhamis, Fatima Ali, Sreeja Attur, Nicholas M. Fountain-Jones et Mohammad Zubaid. « Anemia or other comorbidities ? using machine learning to reveal deeper insights into the drivers of acute coronary syndromes in hospital admitted patients ». PLOS ONE 17, no 1 (24 janvier 2022) : e0262997. http://dx.doi.org/10.1371/journal.pone.0262997.

Texte intégral
Résumé :
Acute coronary syndromes (ACS) are a leading cause of death worldwide, yet the diagnosis and treatment of this group of diseases represent a significant challenge for clinicians. The epidemiology of ACS is extremely complex, and the relationship between ACS and patient risk factors is typically non-linear and highly variable across the patient lifespan. Here, we aim to uncover deeper insights into the factors that shape ACS outcomes in hospitals across four Arabian Gulf countries. Further, because anemia is one of the most frequently observed comorbidities, we explored its role in the prognosis of the most prevalent in-hospital ACS outcomes (mortality, heart failure, and bleeding) in the region. We used a robust multi-algorithm interpretable machine learning (ML) pipeline and 20 relevant risk factors to fit predictive models to 4,044 patients presenting with ACS between 2012 and 2013. We found that in-hospital heart failure followed by anemia was the most important predictor of mortality, whereas anemia was the most important predictor of both in-hospital heart failure and bleeding. For all in-hospital outcomes, anemia had remarkably non-linear relationships with both ACS outcomes and patients’ baseline characteristics. With minimal statistical assumptions, our ML models had reasonable predictive performance (AUCs > 0.75) and substantially outperformed commonly used statistical and risk stratification methods. Moreover, our pipeline was able to elucidate the ACS risk of individual patients based on their unique risk factors. Fully interpretable ML approaches are rarely used in clinical settings, particularly in the Middle East, but have the potential to improve clinicians’ prognostic efforts and guide policymakers in reducing the health and economic burdens of ACS worldwide.
Styles APA, Harvard, Vancouver, ISO, etc.
44

Wongvibulsin, Shannon, Katherine C. Wu et Scott L. Zeger. « Improving Clinical Translation of Machine Learning Approaches Through Clinician-Tailored Visual Displays of Black Box Algorithms : Development and Validation ». JMIR Medical Informatics 8, no 6 (9 juin 2020) : e15791. http://dx.doi.org/10.2196/15791.

Texte intégral
Résumé :
Background: Despite the promise of machine learning (ML) to inform individualized medical care, the clinical utility of ML in medicine has been limited by the minimal interpretability and black box nature of these algorithms. Objective: The study aimed to demonstrate a general and simple framework for generating clinically relevant and interpretable visualizations of black box predictions to aid in the clinical translation of ML. Methods: To obtain improved transparency of ML, simplified models and visual displays can be generated using common methods from clinical practice such as decision trees and effect plots. We illustrated the approach based on postprocessing of ML predictions, in this case random forest predictions, and applied the method to data from the Left Ventricular (LV) Structural Predictors of Sudden Cardiac Death (SCD) Registry for individualized risk prediction of SCD, a leading cause of death. Results: With the LV Structural Predictors of SCD Registry data, SCD risk predictions are obtained from a random forest algorithm that identifies the most important predictors, nonlinearities, and interactions among a large number of variables while naturally accounting for missing data. The black box predictions are postprocessed using classification and regression trees into a clinically relevant and interpretable visualization. The method also quantifies the relative importance of an individual or a combination of predictors. Several risk factors (heart failure hospitalization, cardiac magnetic resonance imaging indices, and serum concentration of systemic inflammation) can be clearly visualized as branch points of a decision tree to discriminate between low-, intermediate-, and high-risk patients. Conclusions: Through a clinically important example, we illustrate a general and simple approach to increase the clinical translation of ML through clinician-tailored visual displays of results from black box algorithms. We illustrate this general model-agnostic framework by applying it to SCD risk prediction. Although we illustrate the methods using SCD prediction with random forest, the methods presented are applicable more broadly to improving the clinical translation of ML, regardless of the specific ML algorithm or clinical application. As any trained predictive model can be summarized in this manner to a prespecified level of precision, we encourage the use of simplified visual displays as an adjunct to the complex predictive model. Overall, this framework can allow clinicians to peek inside the black box and develop a deeper understanding of the most important features from a model to gain trust in the predictions and confidence in applying them to clinical care.
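The core idea, postprocessing black-box risk predictions into a small, readable tree, can be sketched generically as a surrogate-tree fit. The example below uses synthetic data and a random forest, and it is an illustration of the general pattern rather than the authors' registry analysis:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic stand-in for registry data.
X, y = make_classification(n_samples=2000, n_features=12, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

# Fit a small, readable tree to the black-box risk predictions (a surrogate display).
risk = rf.predict_proba(X)[:, 1]
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, risk)
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(12)]))
```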
Styles APA, Harvard, Vancouver, ISO, etc.
45

Khadem, Heydar, Hoda Nemat, Jackie Elliott et Mohammed Benaissa. « Interpretable Machine Learning for Inpatient COVID-19 Mortality Risk Assessments : Diabetes Mellitus Exclusive Interplay ». Sensors 22, no 22 (12 novembre 2022) : 8757. http://dx.doi.org/10.3390/s22228757.

Texte intégral
Résumé :
People with diabetes mellitus (DM) are at elevated risk of in-hospital mortality from coronavirus disease 2019 (COVID-19). This vulnerability has spurred efforts to pinpoint distinctive characteristics of COVID-19 patients with DM. In this context, the present article develops ML models equipped with interpretation modules for inpatient mortality risk assessment of COVID-19 patients with DM. To this end, a cohort of 156 hospitalised COVID-19 patients with pre-existing DM is studied. For creating the risk assessment platforms, this work explores a pool of historical, on-admission, and during-admission data that are DM-related or, according to preliminary investigations, are exclusively attributed to the COVID-19 susceptibility of DM patients. First, a set of careful pre-modelling steps is executed on the clinical data, including cleaning, pre-processing, subdivision, and feature elimination. Subsequently, standard machine learning (ML) modelling analysis is performed on the curated data. Initially, a classifier is tasked with forecasting COVID-19 fatality from the selected features, and the model undergoes a thorough evaluation analysis. The results achieved substantiate the efficacy of the undertaken data curation and modelling steps. Afterwards, the SHapley Additive exPlanations (SHAP) technique is used to interpret the generated mortality risk prediction model by rating the predictors’ global and local influence on the model’s outputs. These interpretations advance the comprehensibility of the analysis by explaining the formation of outcomes and, in this way, foster the adoption of the proposed methodologies. Next, a clustering algorithm demarcates patients into four separate groups based on their SHAP values, providing a practical risk stratification method. Finally, a re-evaluation analysis is performed to verify the robustness of the proposed framework.
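The SHAP-then-cluster step described above can be sketched generically: compute per-patient SHAP value vectors for a fitted tree model, then group patients into four clusters over those vectors. The data below are synthetic, the shap package is assumed to be installed, and this is not the authors' pipeline:

```python
import shap
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for the 156-patient cohort features.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
model = GradientBoostingClassifier(random_state=42).fit(X, y)

shap_values = shap.TreeExplainer(model).shap_values(X)   # per-patient feature contributions
groups = KMeans(n_clusters=4, random_state=42).fit_predict(shap_values)
for g in range(4):
    print(f"group {g}: {(groups == g).sum()} patients")   # SHAP-based risk strata
```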
Styles APA, Harvard, Vancouver, ISO, etc.
46

Estivill-Castro, Vladimir, Eugene Gilmore et René Hexel. « Constructing Explainable Classifiers from the Start—Enabling Human-in-the Loop Machine Learning ». Information 13, no 10 (29 septembre 2022) : 464. http://dx.doi.org/10.3390/info13100464.

Texte intégral
Résumé :
Interactive machine learning (IML) enables the incorporation of human expertise because the human participates in the construction of the learned model. Moreover, with human-in-the-loop machine learning (HITL-ML), the human experts drive the learning, and they can steer the learning objective not only for accuracy but perhaps for characterisation and discrimination rules, where separating one class from others is the primary objective. This interaction also enables humans to explore and gain insights into the dataset as well as validate the learned models. Validation requires transparency and interpretable classifiers. The huge relevance of understandable classification has recently been emphasised for many applications under the banner of explainable artificial intelligence (XAI). We use parallel coordinates to deploy an IML system that enables not only the visualisation of decision tree classifiers but also the generation of interpretable splits beyond parallel-axis splits. Moreover, we show that characterisation and discrimination rules are also well communicated using parallel coordinates. In particular, we report results from the largest usability study of an IML system, confirming the merits of our approach.
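Parallel-coordinates plots of labelled data, the visual backbone of the system described here, are available directly in pandas. The sketch below uses the iris dataset as a stand-in and is unrelated to the authors' tool:

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame.copy()
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

# One polyline per sample across the feature axes, coloured by class.
parallel_coordinates(df.drop(columns="target"), "species", alpha=0.4)
plt.tight_layout()
plt.show()
```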
Styles APA, Harvard, Vancouver, ISO, etc.
47

Daly, Elizabeth M., Massimiliano Mattetti, Öznur Alkan et Rahul Nair. « User Driven Model Adjustment via Boolean Rule Explanations ». Proceedings of the AAAI Conference on Artificial Intelligence 35, no 7 (18 mai 2021) : 5896–904. http://dx.doi.org/10.1609/aaai.v35i7.16737.

Texte intégral
Résumé :
AI solutions are heavily dependent on the quality and accuracy of the input training data; however, the training data may not always fully reflect the most up-to-date policy landscape or may be missing business logic. The advances in explainability have opened the possibility of allowing users to interact with interpretable explanations of ML predictions in order to inject modifications or constraints that more accurately reflect current realities of the system. In this paper, we present a solution which leverages the predictive power of ML models while allowing the user to specify modifications to decision boundaries. Our interactive overlay approach achieves this goal without requiring model retraining, making it appropriate for systems that need to apply instant changes to their decision making. We demonstrate that user feedback rules can be layered with the ML predictions to provide immediate changes, which in turn supports learning with less data.
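The overlay idea, applying user-supplied Boolean rules on top of model outputs without retraining, can be sketched in a few lines. The rule and feature indices below are purely illustrative, and this is a simplified pattern rather than the paper's system:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def predict_with_overlay(model, X, rules):
    """Layer user-supplied Boolean rules over the model's predictions.
    Each rule is a pair (condition(X) -> boolean mask, forced_label)."""
    preds = model.predict(X)
    for condition, forced_label in rules:
        preds = np.where(condition(X), forced_label, preds)
    return preds

# Illustrative policy rule: whenever feature 0 exceeds 2.0, force the positive class.
rules = [(lambda X: X[:, 0] > 2.0, 1)]
print(predict_with_overlay(model, X, rules)[:10])
```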
Styles APA, Harvard, Vancouver, ISO, etc.
48

Bermejo, Pablo, Alicia Vivo, Pedro J. Tárraga et J. A. Rodríguez-Montes. « Development of Interpretable Predictive Models for BPH and Prostate Cancer ». Clinical Medicine Insights : Oncology 9 (janvier 2015) : CMO.S19739. http://dx.doi.org/10.4137/cmo.s19739.

Texte intégral
Résumé :
Background: Traditional methods for deciding whether to recommend a patient for a prostate biopsy are based on cut-off levels of stand-alone markers such as prostate-specific antigen (PSA) or any of its derivatives. However, in the last decade we have seen the increasing use of predictive models that combine, in a non-linear manner, several predictors that are better able to predict prostate cancer (PC), but these fail to help the clinician to distinguish between PC and benign prostate hyperplasia (BPH) patients. We construct two new models that are capable of predicting both PC and BPH. Methods: An observational study was performed on 150 patients with PSA ≥3 ng/mL and age >50 years. We built a decision tree and a logistic regression model, validated with the leave-one-out methodology, in order to predict PC or BPH, or reject both. Results: Statistical dependence with PC and BPH was found for prostate volume (P-value < 0.001), PSA (P-value < 0.001), international prostate symptom score (IPSS; P-value < 0.001), digital rectal examination (DRE; P-value < 0.001), age (P-value < 0.002), antecedents (P-value < 0.006), and meat consumption (P-value < 0.08). The two predictive models that were constructed selected a subset of these, namely volume, PSA, DRE, and IPSS, obtaining an area under the ROC curve (AUC) between 72% and 80% for both PC and BPH prediction. Conclusion: PSA and volume together help to build predictive models that accurately distinguish among PC, BPH, and patients without either of these pathologies. Our decision tree and logistic regression models outperform the AUC obtained in the compared studies. Using these models as decision support, the number of unnecessary biopsies might be significantly reduced.
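Leave-one-out validation of a decision tree and a logistic regression, the evaluation scheme described here, is easy to reproduce in outline with scikit-learn. The data below are synthetic stand-ins for the four clinical predictors, and this is not the authors' analysis:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for 150 patients with 4 predictors (e.g., volume, PSA, DRE, IPSS).
X, y = make_classification(n_samples=150, n_features=4, random_state=3)
loo = LeaveOneOut()
for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=3, random_state=3)),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    proba = cross_val_predict(clf, X, y, cv=loo, method="predict_proba")[:, 1]
    print(name, round(roc_auc_score(y, proba), 3))  # leave-one-out AUC
```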
Styles APA, Harvard, Vancouver, ISO, etc.
49

De Cannière, Hélène, Federico Corradi, Christophe J. P. Smeets, Melanie Schoutteten, Carolina Varon, Chris Van Hoof, Sabine Van Huffel, Willemijn Groenendaal et Pieter Vandervoort. « Wearable Monitoring and Interpretable Machine Learning Can Objectively Track Progression in Patients during Cardiac Rehabilitation ». Sensors 20, no 12 (26 juin 2020) : 3601. http://dx.doi.org/10.3390/s20123601.

Texte intégral
Résumé :
Cardiovascular diseases (CVD) are often characterized by their multifactorial complexity. This makes remote monitoring and ambulatory cardiac rehabilitation (CR) therapy challenging. Current wearable multimodal devices enable remote monitoring. Machine learning (ML) and artificial intelligence (AI) can help in tackling multifaceted datasets. However, for clinical acceptance, easy interpretability of the AI models is crucial. The goal of the present study was to investigate whether a multi-parameter sensor could be used during a standardized activity test to interpret functional capacity in the longitudinal follow-up of CR patients. A total of 129 patients were followed for 3 months during CR using 6-min walking tests (6MWT) equipped with a wearable ECG and accelerometer device. Functional capacity was assessed based on 6MWT distance (6MWD). Linear and nonlinear interpretable models were explored to predict 6MWD. The t-distributed stochastic neighbor embedding (t-SNE) technique was exploited to embed and visualize high-dimensional data. The performance of support vector machine (SVM) models, combining different features and using different kernel types, to predict functional capacity was evaluated. The SVM model, using chronotropic response and effort as input features, showed a mean absolute error of 42.8 m (±36.8 m). The 3D maps derived using the t-SNE technique visualized the relationship between sensor-derived biomarkers and functional capacity, which enables tracking of the evolution of patients throughout the CR program. The current study showed that wearable monitoring combined with interpretable ML can objectively track clinical progression in a CR population. These results pave the road towards ambulatory CR.
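The modelling pattern described here, a support vector regressor predicting walking distance from two sensor-derived features plus a t-SNE embedding for visualization, can be sketched generically. The features below are random stand-ins for chronotropic response and effort (a 2D embedding is used instead of the paper's 3D maps), and the numbers are not the study's results:

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-ins: (chronotropic response, effort) -> 6MWT distance in metres.
rng = np.random.default_rng(1)
features = rng.normal(size=(129, 2))
distance = 450 + 60 * features[:, 0] + 40 * features[:, 1] + rng.normal(0, 30, 129)

X_tr, X_te, y_tr, y_te = train_test_split(features, distance, random_state=1)
svr = SVR(kernel="rbf", C=100).fit(X_tr, y_tr)
print("MAE (m):", round(mean_absolute_error(y_te, svr.predict(X_te)), 1))

embedding = TSNE(n_components=2, perplexity=20, random_state=1).fit_transform(features)
print(embedding.shape)  # (129, 2): low-dimensional map of the sensor features
```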
Styles APA, Harvard, Vancouver, ISO, etc.
50

Dimitriadis, Ilias, Konstantinos Georgiou et Athena Vakali. « Social Botomics : A Systematic Ensemble ML Approach for Explainable and Multi-Class Bot Detection ». Applied Sciences 11, no 21 (21 octobre 2021) : 9857. http://dx.doi.org/10.3390/app11219857.

Texte intégral
Résumé :
OSN platforms are under attack by intruders born and raised within their own ecosystems. These attacks have multiple scopes from mild critiques to violent offences targeting individual or community rights and opinions. Negative publicity on microblogging platforms, such as Twitter, is due to the infamous Twitter bots which highly impact posts’ circulation and virality. A wide and ongoing research effort has been devoted to develop appropriate countermeasures against emerging “armies of bots”. However, the battle against bots is still intense and unfortunately, it seems to lean on the bot-side. Since, in an effort to win any war, it is critical to know your enemy, this work aims to demystify, reveal, and widen inherent characteristics of Twitter bots such that multiple types of bots are recognized and spotted early. More specifically in this work we: (i) extensively analyze the importance and the type of data and features used to generate ML models for bot classification, (ii) address the open problem of multi-class bot detection, identifying new types of bots, and share two new datasets towards this objective, (iii) provide new individual ML models for binary and multi-class bot classification and (iv) utilize explainable methods and provide comprehensive visualizations to clearly demonstrate interpretable results. Finally, we utilize all of the above in an effort to improve the so called Bot-Detective online service. Our experiments demonstrate high accuracy, explainability and scalability, comparable with the state of the art, despite multi-class classification challenges.
Styles APA, Harvard, Vancouver, ISO, etc.