Academic literature on the topic 'Model-agnostic methods'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Model-agnostic methods.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Model-agnostic methods":

1

Su, Houcheng, Weihao Luo, Daixian Liu, Mengzhu Wang, Jing Tang, Junyang Chen, Cong Wang, and Zhenghan Chen. "Sharpness-Aware Model-Agnostic Long-Tailed Domain Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (March 24, 2024): 15091–99. http://dx.doi.org/10.1609/aaai.v38i13.29431.

Abstract:
Domain Generalization (DG) aims to improve the generalization ability of models trained on a specific group of source domains, enabling them to perform well on new, unseen target domains. Recent studies have shown that methods that converge to smooth optima can enhance the generalization performance of supervised learning tasks such as classification. In this study, we examine the impact of smoothness-enhancing formulations on domain adversarial training, which combines task loss and adversarial loss objectives. Our approach leverages the fact that converging to a smooth minimum with respect to task loss can stabilize the task loss and lead to better performance on unseen domains. Furthermore, we recognize that the distribution of objects in the real world often follows a long-tailed class distribution, resulting in a mismatch between machine learning models and our expectations of their performance on all classes of datasets with long-tailed class distributions. To address this issue, we consider the domain generalization problem from the perspective of the long-tail distribution and propose using the maximum square loss to balance different classes which can improve model generalizability. Our method's effectiveness is demonstrated through comparisons with state-of-the-art methods on various domain generalization datasets. Code: https://github.com/bamboosir920/SAMALTDG.
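
As a concrete illustration of the smoothness idea above, the following minimal Python sketch implements a SAM-style update: step to the locally worst nearby weights, then descend using the gradient taken there. The toy loss and the hyperparameters (rho, lr) are assumptions for illustration; the paper's method additionally involves adversarial domain losses and the maximum square loss, which are omitted here.

import numpy as np

def loss(w):                                      # toy non-convex loss
    return np.sum(w ** 4 - 3 * w ** 2)

def grad(w):                                      # its analytic gradient
    return 4 * w ** 3 - 6 * w

w, lr, rho = np.array([2.0, -1.5]), 0.01, 0.05
for _ in range(200):
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # perturb toward the worst nearby point
    w -= lr * grad(w + eps)                       # descend with the perturbed gradient
print("flat-minimum estimate:", w, "loss:", loss(w))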
2

Pugnana, Andrea, and Salvatore Ruggieri. "A Model-Agnostic Heuristics for Selective Classification." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9461–69. http://dx.doi.org/10.1609/aaai.v37i8.26133.

Abstract:
Selective classification (also known as classification with reject option) conservatively extends a classifier with a selection function to determine whether or not a prediction should be accepted (i.e., trusted, used, deployed). This is a highly relevant issue in socially sensitive tasks, such as credit scoring. State-of-the-art approaches rely on Deep Neural Networks (DNNs) that train at the same time both the classifier and the selection function. These approaches are model-specific and computationally expensive. We propose a model-agnostic approach, as it can work with any base probabilistic binary classification algorithm, and it can be scalable to large tabular datasets if the base classifier is so. The proposed algorithm, called SCROSS, exploits a cross-fitting strategy and theoretical results for quantile estimation to build the selection function. Experiments on real-world data show that SCROSS improves over existing methods.
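
The selection function described here can be sketched in a few lines: estimate class probabilities out-of-fold (the cross-fitting step), then set the rejection threshold to the confidence quantile that matches a target coverage. This is a simplified reading of the SCROSS recipe, with an assumed coverage level and base classifier, not the authors' exact estimator.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=2000, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities avoid optimistically biased confidence estimates.
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")
conf = proba.max(axis=1)

coverage = 0.8                                    # accept 80% of instances
tau = np.quantile(conf, 1 - coverage)             # reject the least confident 20%
accept = conf >= tau
acc = (proba.argmax(axis=1)[accept] == y[accept]).mean()
print(f"coverage={accept.mean():.2f}, selective accuracy={acc:.3f}")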
3

Satrya, Wahyu Fadli, and Ji-Hoon Yun. "Combining Model-Agnostic Meta-Learning and Transfer Learning for Regression." Sensors 23, no. 2 (January 4, 2023): 583. http://dx.doi.org/10.3390/s23020583.

Abstract:
For cases in which a machine learning model needs to be adapted to a new task, various approaches have been developed, including model-agnostic meta-learning (MAML) and transfer learning. In this paper, we investigate how the differences in the data distributions between the old tasks and the new target task impact performance in regression problems. By performing experiments, we discover that these differences greatly affect the relative performance of different adaptation methods. Based on this observation, we develop ensemble schemes combining multiple adaptation methods that can handle a wide range of data distribution differences between the old and new tasks, thus offering more stable performance for a wide range of tasks. For evaluation, we consider three regression problems of sinusoidal fitting, virtual reality motion prediction, and temperature forecasting. The evaluation results demonstrate that the proposed ensemble schemes achieve the best performance among the considered methods in most cases.
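
A minimal sketch of the ensemble idea on the paper's sinusoidal example: adapt several candidate models to a new task from a few support samples and weight them by support-set error. The candidate "adaptation methods" below (ridge fits of differing strength standing in for MAML-style and transfer-style adaptation) and the softmax weighting are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def task(amp, phase, n):                          # sinusoidal task family
    x = rng.uniform(-5, 5, (n, 1))
    return x, amp * np.sin(x[:, 0] + phase)

x_s, y_s = task(2.0, 0.3, 10)                     # few-shot support set
x_q, y_q = task(2.0, 0.3, 200)                    # query set

feats = lambda x: np.hstack([np.sin(k * x) for k in range(1, 6)])
models = [Ridge(alpha=a).fit(feats(x_s), y_s) for a in (0.01, 1.0, 100.0)]

errs = np.array([np.mean((m.predict(feats(x_s)) - y_s) ** 2) for m in models])
w = np.exp(-errs); w /= w.sum()                   # softmax-style ensemble weights
pred = sum(wi * m.predict(feats(x_q)) for wi, m in zip(w, models))
print("ensemble query MSE:", np.mean((pred - y_q) ** 2))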
4

Atallah, Rasha Ragheb, Amirrudin Kamsin, Maizatul Akmar Ismail, and Ahmad Sami Al-Shamayleh. "NEURAL NETWORK WITH AGNOSTIC META-LEARNING MODEL FOR FACE-AGING RECOGNITION." Malaysian Journal of Computer Science 35, no. 1 (January 31, 2022): 56–69. http://dx.doi.org/10.22452/mjcs.vol35no1.4.

Abstract:
Face recognition is one of the most approachable and accessible authentication methods. It is also accepted by users, as it is non-invasive. However, aging results in changes in the texture and shape of a face. Hence, age is one of the factors that decreases the accuracy of face recognition. Face aging, or age progression, is thus a significant challenge in face recognition methods. This paper presents the use of an artificial neural network with model-agnostic meta-learning (ANN-MAML) for face-aging recognition. Model-agnostic meta-learning (MAML) is a meta-learning method used to train a model using parameters obtained from identical tasks with certain updates. This study aims to design and model a framework to recognize face aging based on an artificial neural network. In addition, the face-aging recognition framework is evaluated against previous frameworks. Furthermore, the performance and the accuracy of ANN-MAML were evaluated using the CALFW (Cross-Age LFW) dataset. A comparison with other methods showed superior performance by ANN-MAML.
5

Zafar, Muhammad Rehman, and Naimul Khan. "Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability." Machine Learning and Knowledge Extraction 3, no. 3 (June 30, 2021): 525–41. http://dx.doi.org/10.3390/make3030027.

Abstract:
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique used to increase the interpretability and explainability of black box Machine Learning (ML) algorithms. LIME typically creates an explanation for a single prediction by any ML model by learning a simpler interpretable model (e.g., linear classifier) around the prediction through generating simulated data around the instance by random perturbation, and obtaining feature importance through applying some form of feature selection. While LIME and similar local algorithms have gained popularity due to their simplicity, the random perturbation methods result in shifts in data and instability in the generated explanations, where for the same prediction, different explanations can be generated. These are critical issues that can prevent deployment of LIME in sensitive domains. We propose a deterministic version of LIME. Instead of random perturbation, we utilize Agglomerative Hierarchical Clustering (AHC) to group the training data together and K-Nearest Neighbour (KNN) to select the relevant cluster of the new instance that is being explained. After finding the relevant cluster, a simple model (i.e., linear model or decision tree) is trained over the selected cluster to generate the explanations. Experimental results on six public (three binary and three multi-class) and six synthetic datasets show the superiority of Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where we quantitatively determine the stability and faithfulness of DLIME compared to LIME.
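
The deterministic neighbourhood construction described above is simple enough to sketch directly: cluster the training data with agglomerative hierarchical clustering, pick the cluster of the instance being explained via a 1-nearest-neighbour lookup, and fit a linear surrogate on that cluster. The cluster count, black box, and surrogate below are illustrative assumptions, not the authors' settings.

import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=1)
black_box = RandomForestClassifier(random_state=1).fit(X, y)

labels = AgglomerativeClustering(n_clusters=10).fit_predict(X)
knn = KNeighborsClassifier(n_neighbors=1).fit(X, labels)

x0 = X[:1]                                        # instance to explain
cluster = X[labels == knn.predict(x0)[0]]         # its deterministic neighbourhood
surrogate = LinearRegression().fit(
    cluster, black_box.predict_proba(cluster)[:, 1])
print("local feature weights:", np.round(surrogate.coef_, 3))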
6

Tak, Jae-Ho, and Byung-Woo Hong. "Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss." Electronics 13, no. 3 (January 29, 2024): 535. http://dx.doi.org/10.3390/electronics13030535.

Abstract:
Artificial intelligence (AI) technology has advanced significantly, now capable of performing tasks previously believed to be exclusive to skilled humans. However, AI models, in contrast to humans who can develop skills with relatively less data, often require substantial amounts of data to emulate human cognitive abilities in specific areas. In situations where adequate pre-training data is not available, meta-learning becomes a crucial method for enhancing generalization. The Model Agnostic Meta-Learning (MAML) algorithm, which employs second-order derivative calculations to fine-tune initial parameters for better starting points, plays a pivotal role in this area. However, the computational demand of this method can be challenging for modern models with a large number of parameters. The concept of the Approximate Hessian Effect is introduced in this context, examining the effectiveness of second-order derivatives in identifying initial parameters conducive to high generalization performance. The study suggests the use of cosine similarity and squared error (L2 loss) as a loss function within the Approximate Hessian Effect framework to modify gradient weights, aiming for more generalizable model parameters. Additionally, an algorithm that relies on first-order calculations is presented, designed to achieve performance levels comparable to MAML. This approach was tested and compared with traditional MAML methods using both the MiniImagenet dataset and a modified MNIST dataset. The results were analyzed to evaluate its efficiency. Compared to previous studies that achieved good performance using only the first derivative, this approach is more efficient because it does not require iterative loops to converge on additional loss functions. Additionally, there is potential for further performance enhancement through hyperparameter tuning.
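
One way to picture the gradient-similarity idea is to reweight per-task gradients by how well they agree with the average gradient before combining them into a meta-update. The sketch below is a loose first-order illustration of that intuition, not the paper's loss function or algorithm.

import numpy as np

rng = np.random.default_rng(0)
task_grads = rng.normal(size=(4, 10))             # one flattened gradient per task
mean_g = task_grads.mean(axis=0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

w = np.array([max(cosine(g, mean_g), 0.0) for g in task_grads])
w /= w.sum() + 1e-12                              # down-weight conflicting tasks
update = (w[:, None] * task_grads).sum(axis=0)
print("gradient weights:", np.round(w, 3))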
7

Hou, Xiaoyu, Jihui Xu, Jinming Wu, and Huaiyu Xu. "Cross Domain Adaptation of Crowd Counting with Model-Agnostic Meta-Learning." Applied Sciences 11, no. 24 (December 17, 2021): 12037. http://dx.doi.org/10.3390/app112412037.

Abstract:
Counting people in crowd scenarios is extensively conducted in drone inspections, video surveillance, and public safety applications. Today, crowd count algorithms with supervised learning have improved significantly, but with a reliance on a large amount of manual annotation. However, in real world scenarios, different photo angles, exposures, location heights, complex backgrounds, and limited annotation data lead to supervised learning methods not working satisfactorily, plus many of them suffer from overfitting problems. To address the above issues, we focus on training synthetic crowd data and investigate how to transfer information to real-world datasets while reducing the need for manual annotation. CNN-based crowd-counting algorithms usually consist of feature extraction, density estimation, and count regression. To improve the domain adaptation in feature extraction, we propose an adaptive domain-invariant feature extracting module. Meanwhile, after taking inspiration from recent innovative meta-learning, we present a dynamic-β MAML algorithm to generate a density map in unseen novel scenes and render the density estimation model more universal. Finally, we use a counting map refiner to optimize the coarse density map transformation into a fine density map and then regress the crowd number. Extensive experiments show that our proposed domain adaptation- and model-generalization methods can effectively suppress domain gaps and produce elaborate density maps in cross-domain crowd-counting scenarios. We demonstrate that the proposals in our paper outperform current state-of-the-art techniques.
8

Chen, Zhouyuan, Zhichao Lian, and Zhe Xu. "Interpretable Model-Agnostic Explanations Based on Feature Relationships for High-Performance Computing." Axioms 12, no. 10 (October 23, 2023): 997. http://dx.doi.org/10.3390/axioms12100997.

Abstract:
In the explainable artificial intelligence (XAI) field, an algorithm or a tool can help people understand how a model makes a decision, and this can help select important features to reduce computational cost and realize high-performance computing. However, existing methods are usually used to visualize important features or highlight active neurons, and few of them show the importance of relationships between features. In recent years, some methods based on a white-box approach have taken relationships between features into account, but most of them only work on specific models. Although methods based on a black-box approach can solve the above problems, most of them can only be applied to tabular or text data instead of image data. To solve these problems, we propose a local interpretable model-agnostic explanation approach based on feature relationships. This approach incorporates the relationships between features into the interpretation process and then visualizes the interpretation results. Finally, this paper conducts extensive experiments to evaluate the correctness of relationships between features and evaluates this XAI method in terms of accuracy, fidelity, and consistency.
9

Hu, Cong, Kai Xu, Zhengqiu Zhu, Long Qin, and Quanjun Yin. "Multi-Agent Chronological Planning with Model-Agnostic Meta Reinforcement Learning." Applied Sciences 13, no. 16 (August 11, 2023): 9174. http://dx.doi.org/10.3390/app13169174.

Abstract:
In this study, we propose an innovative approach to address a chronological planning problem involving multiple agents required to complete tasks under precedence constraints. We model this problem as a stochastic game and solve it with multi-agent reinforcement learning algorithms. However, these algorithms necessitate relearning from scratch when confronted with changes in the chronological order of tasks, resulting in distinct stochastic games and consuming a substantial amount of time. To overcome this challenge, we present a novel framework that incorporates meta-learning into a multi-agent reinforcement learning algorithm. This approach enables the extraction of meta-parameters from past experiences, facilitating rapid adaptation to new tasks with altered chronological orders and circumventing the time-intensive nature of reinforcement learning. The proposed framework is demonstrated through the implementation of a method named Reptile-MADDPG. The performance of the pre-trained model is evaluated using average rewards before and after fine-tuning. In two testing tasks, our method improves the average rewards from −44 to −37 through 10,000 steps of fine-tuning, significantly surpassing the two baseline methods, which only attained −51 and −44, respectively. The experimental results demonstrate the superior generalization capabilities of our method across various tasks, thus constituting a significant contribution towards the design of intelligent unmanned systems.
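
The meta-learning ingredient here builds on Reptile, whose update is compact enough to sketch: adapt a copy of the meta-parameters to a sampled task for a few inner steps, then move the meta-parameters a little toward the adapted copy. The toy "task" below (fitting a random target vector) stands in for MADDPG training on a sampled stochastic game; the step sizes are assumptions.

import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(5)                               # meta-parameters
for _ in range(1000):
    target = rng.normal(size=5)                   # sample a task
    phi = theta.copy()
    for _ in range(10):                           # inner-loop adaptation
        phi -= 0.1 * (phi - target)               # gradient of 0.5*||phi - target||^2
    theta += 0.1 * (phi - theta)                  # Reptile meta-step
print("meta-initialization:", np.round(theta, 2))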
10

Xue, Tianfang, and Haibin Yu. "Unbiased Model-Agnostic Metalearning Algorithm for Learning Target-Driven Visual Navigation Policy." Computational Intelligence and Neuroscience 2021 (December 8, 2021): 1–12. http://dx.doi.org/10.1155/2021/5620751.

Abstract:
As deep reinforcement learning methods have made great progress in the visual navigation field, metalearning-based algorithms are gaining more attention since they greatly improve the expansibility of moving agents. In the typical metatraining mechanism, an initial model is trained as a metalearner on existing navigation tasks and comes to perform well in new scenes through relatively few recursive trials. However, if a metalearner is overtrained on the former tasks, it may hardly generalize to navigating unfamiliar environments, as the initial model turns out to be quite biased towards the former ambient configuration. In order to train an impartial navigation model and enhance its generalization capability, we propose an Unbiased Model-Agnostic Metalearning (UMAML) algorithm for target-driven visual navigation. Inspired by entropy-based methods that maximize the uncertainty over output labels in classification tasks, we adopt inequality measures used in economics as a concise metric to calculate the loss deviation across unfamiliar tasks. By succinctly minimizing the inequality of task losses, an unbiased navigation model that does not overperform on particular scene types can be learnt based on the Model-Agnostic Metalearning mechanism. The exploring agent complies with a more balanced update rule and is able to gather navigation experience from training environments. Several experiments have been conducted, and the results demonstrate that our approach outperforms other state-of-the-art metalearning navigation methods in generalization ability.
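
The inequality-minimization idea can be made concrete with one economics-style measure. The sketch below penalizes the Theil index of per-task losses on top of their mean; the choice of the Theil index and the penalty weight are illustrative assumptions, not necessarily the paper's exact formulation.

import numpy as np

def theil_index(losses):
    x = np.asarray(losses, dtype=float)
    m = x.mean()
    return np.mean((x / m) * np.log(x / m + 1e-12))

task_losses = np.array([0.8, 1.1, 0.9, 3.0])      # one loss per training task
lam = 0.5                                         # penalty weight (assumed)
meta_loss = task_losses.mean() + lam * theil_index(task_losses)
print(f"mean={task_losses.mean():.3f}  theil={theil_index(task_losses):.3f}  "
      f"meta-loss={meta_loss:.3f}")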

Dissertations / Theses on the topic "Model-agnostic methods":

1

Kanerva, Anton, and Fredrik Helgesson. "On the Use of Model-Agnostic Interpretation Methods as Defense Against Adversarial Input Attacks on Tabular Data." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20085.

Abstract:
Context. Machine learning is a constantly developing subfield within the artificial intelligence field. The number of domains in which we deploy machine learning models is constantly growing and the systems using these models spread almost unnoticeably in our daily lives through different devices. In previous years, much time and effort has been put into increasing the performance of these models, overshadowing the significant risks of attacks targeting the very core of the systems, the trained machine learning models themselves. A specific attack with the aim of fooling the decision-making of a model, called the adversarial input attack, has almost exclusively been researched for models processing image data. However, the threat of adversarial input attacks stretches beyond systems using image data, to, e.g., the tabular domain, which is the most common data domain used in industry. Methods used for interpreting complex machine learning models can help humans understand the behavior and predictions of these complex machine learning systems. Understanding the behavior of a model is an important component in detecting, understanding and mitigating vulnerabilities of the model. Objectives. This study aims to reduce the research gap of adversarial input attacks and defenses targeting machine learning models in the tabular data domain. The goal of this study is to analyze how model-agnostic interpretation methods can be used in order to mitigate and detect adversarial input attacks on tabular data. Methods. The goal is reached by conducting three consecutive experiments where model interpretation methods are analyzed and adversarial input attacks are evaluated as well as visualized in terms of perceptibility. Additionally, a novel method for adversarial input attack detection based on model interpretation is proposed together with a novel way of defensively using feature selection to reduce the attack vector size. Results. The adversarial input attack detection showed state-of-the-art results with an accuracy over 86%. The proposed feature selection-based mitigation technique was successful in hardening the model from adversarial input attacks by reducing their scores by 33% without decreasing the performance of the model. Conclusions. This study contributes with satisfactory and useful methods for adversarial input attack detection and mitigation as well as methods for evaluating and visualizing the imperceptibility of attacks on tabular data.
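
A minimal sketch of the detection idea, under stated assumptions: compute a simple model-agnostic attribution (here, occlusion-style probability drops) for the training data, fit an anomaly detector on those attribution vectors, and flag inputs whose attributions look unusual. The thesis' concrete interpretation method and detector differ; everything below is a stand-in.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def attribution(x):
    # drop in positive-class probability when each feature is zeroed out
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    out = np.empty(len(x))
    for i in range(len(x)):
        z = x.copy(); z[i] = 0.0
        out[i] = base - model.predict_proba(z.reshape(1, -1))[0, 1]
    return out

A = np.array([attribution(x) for x in X])
detector = IsolationForest(random_state=0).fit(A)

x_adv = X[0] + 2.5                                # crude stand-in for an attack
flag = detector.predict(attribution(x_adv).reshape(1, -1))[0] == -1
print("flagged as adversarial:", flag)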
2

Danesh, Alaghehband Tina Sadat. "Vers une conception robuste en ingénierie des procédés. Utilisation de modèles agnostiques de l'interprétabilité en apprentissage automatique." Electronic Thesis or Diss., Toulouse, INPT, 2023. http://www.theses.fr/2023INPT0138.

Abstract:
Robust process design holds paramount importance in various industries, such as process and chemical engineering. The nature of robustness lies in ensuring that a process can consistently deliver desired outcomes for decision-makers and/or stakeholders, even when faced with intrinsic variability and uncertainty. A robustly designed process not only enhances product quality and reliability but also significantly reduces the risk of costly failures, downtime, and product recalls. It enhances efficiency and sustainability by minimizing process deviations and failures. There are different methods to approach the robustness of a complex system, such as the design of experiments, robust optimization, and response surface methodology. Among the robust design methods, sensitivity analysis could be applied as a supportive technique to gain insights into how changes in input parameters affect performance and robustness. Due to the rapid development and advancement of engineering science, the use of physical models for sensitivity analysis presents several challenges, such as unsatisfied assumptions and computation time. These problems lead us to consider applying machine learning (ML) models to complex processes. As the issue of interpretability in ML has gained increasing importance, there is a growing need to understand how these models arrive at their predictions or decisions and how different parameters are related. As their performance consistently surpasses that of other models, such as knowledge-based models, the provision of explanations, justifications, and insights into the workings of ML models not only enhances their trustworthiness and fairness but also empowers stakeholders to make informed decisions, identify biases, detect errors, and improve the overall performance and reliability of the process. Various methods are available to address interpretability, including model-specific and model-agnostic methods. In this thesis, our objective is to enhance the interpretability of various ML methods while maintaining a balance between accuracy and interpretability, to assure decision-makers or stakeholders that our model or process can be considered robust. Simultaneously, we aim to demonstrate that users can trust ML model predictions guaranteed by model-agnostic techniques, which work across various scenarios, including equation-based, hybrid, and data-driven models. To achieve this goal, we applied several model-agnostic methods, such as partial dependence plots, individual conditional expectations, and accumulated local effects, to diverse applications.
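
Of the model-agnostic methods listed above, the partial dependence plot is the simplest to write down: sweep one feature over a grid, force it to each grid value across the whole dataset, and average the model's predictions. The sketch below is a generic illustration on synthetic data, not the thesis' code; individual conditional expectation curves are the same computation without the final averaging.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=400, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

feature = 0
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), 20)
pd_curve = []
for v in grid:
    Xv = X.copy()
    Xv[:, feature] = v                            # force the feature to v everywhere
    pd_curve.append(model.predict(Xv).mean())     # average prediction = PD value
print(np.round(pd_curve, 1))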
3

Neves, Maria Inês Lourenço das. "Opening the black-box of artificial intelligence predictions on clinical decision support systems." Master's thesis, 2021. http://hdl.handle.net/10362/126699.

Abstract:
Cardiovascular diseases are the leading global death cause. Their treatment and prevention rely on electrocardiogram interpretation, which is dependent on the physician’s variability. Subjectiveness is intrinsic to electrocardiogram interpretation and hence, prone to errors. To assist physicians in making precise and thoughtful decisions, artificial intelligence is being deployed to develop models that can interpret extensive datasets and provide accurate decisions. However, the lack of interpretability of most machine learning models stands as one of the drawbacks of their deployment, particularly in the medical domain. Furthermore, most of the currently deployed explainable artificial intelligence methods assume independence between features, which means temporal independence when dealing with time series. The inherent characteristic of time series cannot be ignored as it carries importance for the human decision making process. This dissertation focuses on the explanation of heartbeat classification using several adaptations of state-of-the-art model-agnostic methods, to locally explain time series classification. To address the explanation of time series classifiers, a preliminary conceptual framework is proposed, and the use of the derivative is suggested as a complement to add temporal dependency between samples. The results were validated on an extensive public dataset, first through the 1-D Jaccard’s index, which compares the subsequences extracted from an interpretable model with those from the explanation methods used, and secondly through the performance decrease, to evaluate whether the explanation fits the model’s behaviour. To assess models with distinct internal logic, the validation was conducted on a more transparent model and a more opaque one, in both binary and multiclass situations. The results show the promising use of including the signal’s derivative to introduce temporal dependency between samples in the explanations, for models with simpler internal logic.

Book chapters on the topic "Model-agnostic methods":

1

Gianfagna, Leonida, and Antonio Di Cecco. "Model-Agnostic Methods for XAI." In Explainable AI with Python, 81–113. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-68640-6_4.

2

Gunel, Kadir, and Mehmet Fatih Amasyali. "Model Agnostic Knowledge Transfer Methods for Sentence Embedding Models." In 2nd International Congress of Electrical and Computer Engineering, 3–16. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-52760-9_1.

3

Molnar, Christoph, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Christian A. Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, and Bernd Bischl. "General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models." In xxAI - Beyond Explainable AI, 39–68. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-04083-2_4.

Abstract:
An increasing number of model-agnostic interpretation techniques for machine learning (ML) models such as partial dependence plots (PDP), permutation feature importance (PFI) and Shapley values provide insightful model interpretations, but can lead to wrong conclusions if applied incorrectly. We highlight many general pitfalls of ML model interpretation, such as using interpretation techniques in the wrong context, interpreting models that do not generalize well, ignoring feature dependencies, interactions, uncertainty estimates and issues in high-dimensional settings, or making unjustified causal interpretations, and illustrate them with examples. We focus on pitfalls for global methods that describe the average model behavior, but many pitfalls also apply to local methods that explain individual predictions. Our paper addresses ML practitioners by raising awareness of pitfalls and identifying solutions for correct model interpretation, but also addresses ML researchers by discussing open issues for further research.
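
One of the methods the chapter warns about, permutation feature importance, fits in a dozen lines, which also makes its pitfall easy to see: permuting a feature destroys its correlation with every other feature, so importances of dependent features can mislead. A generic sketch, not the chapter's code:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
base = model.score(X_te, y_te)
for j in range(X.shape[1]):
    Xp = X_te.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])          # destroy feature j's information
    print(f"feature {j}: importance = {base - model.score(Xp, y_te):.3f}")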
4

Baniecki, Hubert, Wojciech Kretowicz, and Przemyslaw Biecek. "Fooling Partial Dependence via Data Poisoning." In Machine Learning and Knowledge Discovery in Databases, 121–36. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-26409-2_8.

Abstract:
Many methods have been developed to understand complex predictive models and high expectations are placed on post-hoc model explainability. It turns out that such explanations are not robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which are among the most popular methods of explaining any predictive model trained on tabular data. We showcase that PD can be manipulated in an adversarial manner, which is alarming, especially in financial or medical applications where auditability became a must-have trait supporting black-box machine learning. The fooling is performed via poisoning the data to bend and shift explanations in the desired direction using genetic and gradient algorithms. We believe this to be the first work using a genetic algorithm for manipulating explanations, which is transferable as it generalizes both ways: in a model-agnostic and an explanation-agnostic manner.
5

Nguyen, Thu Trang, Thach Le Nguyen, and Georgiana Ifrim. "A Model-Agnostic Approach to Quantifying the Informativeness of Explanation Methods for Time Series Classification." In Advanced Analytics and Learning on Temporal Data, 77–94. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-65742-0_6.

6

Krishna, Siddharth, Michael Emmi, Constantin Enea, and Dejan Jovanović. "Verifying Visibility-Based Weak Consistency." In Programming Languages and Systems, 280–307. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-44914-8_11.

Abstract:
Multithreaded programs generally leverage efficient and thread-safe concurrent objects like sets, key-value maps, and queues. While some concurrent-object operations are designed to behave atomically, each witnessing the atomic effects of predecessors in a linearization order, others forego such strong consistency to avoid complex control and synchronization bottlenecks. For example, contains(value) methods of key-value maps may iterate through key-value entries without blocking concurrent updates, to avoid unwanted performance bottlenecks, and consequently overlook the effects of some linearization-order predecessors. While such weakly-consistent operations may not be atomic, they still offer guarantees, e.g., only observing values that have been present. In this work we develop a methodology for proving that concurrent object implementations adhere to weak-consistency specifications. In particular, we consider (forward) simulation-based proofs of implementations against relaxed-visibility specifications, which allow designated operations to overlook some of their linearization-order predecessors, i.e., behaving as if they never occurred. Besides annotating implementation code to identify linearization points, i.e., points at which operations’ logical effects occur, we also annotate code to identify visible operations, i.e., operations whose effects are observed; in practice this annotation can be done automatically by tracking the writers to each accessed memory location. We formalize our methodology over a general notion of transition systems, agnostic to any particular programming language or memory model, and demonstrate its application, using automated theorem provers, by verifying models of Java concurrent object implementations.
7

Gunasekaran, Abirami, Minsi Chen, Richard Hill, and Keith McCabe. "Method Agnostic Model Class Reliance (MAMCR) Explanation of Multiple Machine Learning Models." In Soft Computing and Its Engineering Applications, 56–71. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-27609-5_5.

8

Lampridis, Orestis, Riccardo Guidotti, and Salvatore Ruggieri. "Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars." In Discovery Science, 357–73. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61527-7_24.

Abstract:
We present xspells, a model-agnostic local approach for explaining the decisions of a black box model for sentiment classification of short texts. The explanations provided consist of a set of exemplar sentences and a set of counter-exemplar sentences. The former are examples classified by the black box with the same label as the text to explain. The latter are examples classified with a different label (a form of counter-factuals). Both are close in meaning to the text to explain, and both are meaningful sentences – albeit they are synthetically generated. xspells generates neighbors of the text to explain in a latent space using Variational Autoencoders for encoding text and decoding latent instances. A decision tree is learned from randomly generated neighbors, and used to drive the selection of the exemplars and counter-exemplars. We report experiments on two datasets showing that xspells outperforms the well-known lime method in terms of quality of explanations, fidelity, and usefulness, and that it is comparable to it in terms of stability.
9

Sim, Min K. "Explanation using model-agnostic methods." In Human-Centered Artificial Intelligence, 17–31. Elsevier, 2022. http://dx.doi.org/10.1016/b978-0-323-85648-5.00008-6.

10

Tiwari, Ravi Shekhar. "Hate speech detection using LSTM and explanation by LIME (local interpretable model-agnostic explanations)." In Computational Intelligence Methods for Sentiment Analysis in Natural Language Processing Applications, 93–110. Elsevier, 2024. http://dx.doi.org/10.1016/b978-0-443-22009-8.00005-7.


Conference papers on the topic "Model-agnostic methods":

1

Menon, Rakesh, Kerem Zaman, and Shashank Srivastava. "MaNtLE: Model-agnostic Natural Language Explainer." In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.emnlp-main.832.

2

Sikder, Md Nazmul Kabir, Feras A. Batarseh, Pei Wang, and Nitish Gorentala. "Model-Agnostic Scoring Methods for Artificial Intelligence Assurance." In 2022 IEEE 29th Annual Software Technology Conference (STC). IEEE, 2022. http://dx.doi.org/10.1109/stc55697.2022.00011.

3

Tayal, Kshitij, Rahul Ghosh, and Vipin Kumar. "Model-agnostic Methods for Text Classification with Inherent Noise." In Proceedings of the 28th International Conference on Computational Linguistics: Industry Track. Stroudsburg, PA, USA: International Committee on Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.coling-industry.19.

4

Sandu, Marian Gabriel, and Ștefan Trăușan-Matu. "Comparing model-agnostic and model-specific XAI methods in Natural Language Processing." In RoCHI - International Conference on Human-Computer Interaction. MATRIX ROM, 2022. http://dx.doi.org/10.37789/rochi.2022.1.1.19.

5

Letrache, Khadija, and Mohammed Ramdani. "Explainable Artificial Intelligence: A Review and Case Study on Model-Agnostic Methods." In 2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA). IEEE, 2023. http://dx.doi.org/10.1109/sita60746.2023.10373722.

6

Emelin, Denis, Ivan Titov, and Rico Sennrich. "Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.616.

7

Regenwetter, Lyle, Yazan Abu Obaideh, and Faez Ahmed. "Counterfactuals for Design: A Model-Agnostic Method for Design Recommendations." In ASME 2023 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2023. http://dx.doi.org/10.1115/detc2023-117216.

Abstract:
We introduce Multi-Objective Counterfactuals for Design (MCD), a novel method for counterfactual optimization in design problems. Counterfactuals are hypothetical situations that can lead to a different decision or choice. In this paper, the authors frame the counterfactual search problem as a design recommendation tool that can help identify modifications to a design, leading to better functional performance. MCD improves upon existing counterfactual search methods by supporting multi-objective queries, which are crucial in design problems, and by decoupling the counterfactual search and sampling processes, thus enhancing efficiency and facilitating objective tradeoff visualization. The paper demonstrates MCD’s core functionality using a two-dimensional test case, followed by three case studies of bicycle design that showcase MCD’s effectiveness in real-world design problems. In the first case study, MCD excels at recommending modifications to query designs that can significantly enhance functional performance, such as weight savings and improvements to the structural safety factor. The second case study demonstrates that MCD can work with a pre-trained language model to suggest design changes based on a subjective text prompt effectively. Lastly, the authors task MCD with increasing a query design’s similarity to a target image and text prompt while simultaneously reducing weight and improving structural performance, demonstrating MCD’s performance on a complex multimodal query. Overall, MCD has the potential to provide valuable recommendations for practitioners and design automation researchers looking for answers to their “What if” questions by exploring hypothetical design modifications and their impact on multiple design objectives.
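
The decoupled search-then-sample structure can be illustrated with a deliberately naive stand-in: perturb the query design at random, score each candidate on several objectives, and keep the Pareto-efficient ones closest to the query. MCD itself uses a dedicated multi-objective optimizer and real performance models; the toy objectives below are assumptions.

import numpy as np

rng = np.random.default_rng(0)

def objectives(x):                                # two toy "performance" models
    return np.array([np.sum(x ** 2), np.sum((x - 1) ** 2)])

query = np.array([2.0, -1.0, 0.5])
cands = query + rng.normal(scale=0.5, size=(500, 3))
scores = np.array([objectives(c) for c in cands])

pareto = [i for i, s in enumerate(scores)
          if not any(np.all(t <= s) and np.any(t < s) for t in scores)]
best = min(pareto, key=lambda i: np.linalg.norm(cands[i] - query))
print(f"{len(pareto)} Pareto-efficient counterfactuals; closest:", cands[best])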
8

Ouyang, Linshu, Yongzheng Zhang, Hui Liu, Yige Chen, and Yipeng Wang. "Gated POS-Level Language Model for Authorship Verification." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/557.

Abstract:
Authorship verification is an important problem that has many applications. The state-of-the-art deep authorship verification methods typically leverage character-level language models to encode author-specific writing styles. However, they often fail to capture syntactic level patterns, leading to sub-optimal accuracy in cross-topic scenarios. Also, due to imperfect cross-author parameter sharing, it's difficult for them to distinguish author-specific writing style from common patterns, leading to data-inefficient learning. This paper introduces a novel POS-level (Part of Speech) gated RNN based language model to effectively learn the author-specific syntactic styles. The author-agnostic syntactic information obtained from the POS tagger pre-trained on large external datasets greatly reduces the number of effective parameters of our model, enabling the model to learn accurate author-specific syntactic styles with limited training data. We also utilize a gated architecture to learn the common syntactic writing styles with a small set of shared parameters and let the author-specific parameters focus on each author's special syntactic styles. Extensive experimental results show that our method achieves significantly better accuracy than state-of-the-art competing methods, especially in cross-topic scenarios (over 5\% in terms of AUC-ROC).
9

Zhan, Guodong David, Mohammed J. Dossary, Trieu Phat Luu, Huang Xu, Ted Furlong, and John Bomidi. "On Field Implementation of Real-Time Bit-Wear Estimation with Bit Agnostic Deep Learning Artificial Intelligence Model Along with Physics-Hybrid Features." In SPE/IADC Middle East Drilling Technology Conference and Exhibition. SPE, 2023. http://dx.doi.org/10.2118/214603-ms.

Abstract:
The estimation of bit wear during real-time operation plays a crucial role in bit trip planning and drilling optimization. Human estimates can be highly subjective and are convoluted by changes in formation and drilling data. Conventional methods using physics-based models and supervised machine learning are time consuming, and their accuracy is significantly limited by the labelled data available. Moreover, those approaches do not consider the entire real-time time/depth series. In this study, we present a real-time, field-validated, bit-agnostic wear model using an unsupervised deep learning method to overcome these challenges. The framework learns an unsupervised representation of LWD and sub-/surface drilling time/depth series data in a lower-dimensional (latent) space with reconstruction ability, facilitating downstream tasks such as bit wear estimation. Specifically, a bi-directional Long Short-Term Memory-based Variational Autoencoder (biLSTM-VAE) projects raw drilling data into a latent space in which the real-time bit wear can be estimated through classification of the incoming real-time data. The deep neural network was trained in an unsupervised manner, the bit-wear estimation is an end-to-end process, and the model was then implemented for evaluation in a real-time lateral. The training results showed significant separation of bit-wear states in the lower-dimensional latent space projected by the trained model, suggesting the feasibility of real-time monitoring and tracking of bit wear states in the latent space. We then employed the trained deep learning model to estimate the bit wear in real-time drilling for seven runs in a lateral. The predicted bit wear for all evaluation field runs closely matched the actual dull grade, with errors smaller than 1.0; among the seven predictions, five agreed exactly with the actual field dull grading. Moreover, real-time data from bits of different manufacturers and the corresponding results demonstrate the model to be bit-agnostic. To the best of our knowledge, this is the first field implementation of an AI-assisted model for real-time bit wear estimation that is both trained in an unsupervised, end-to-end manner and evaluated on completely unseen time/depth series data. Commonly available real-time data is selected to ensure ease of applicability. Our approach also introduces a novel method of estimating bit wear by tracking its trajectory in the latent space, including memory, as opposed to isolated events. This helps improve the efficiency of drilling operations and can significantly affect the economics of well engineering. Compared to traditional physics-based models that have been applied to estimate bit wear, the proposed AI model is bit-agnostic and applicable to a wide range of drilling optimization applications.

Reports on the topic "Model-agnostic methods":

1

Walizer, Laura, Robert Haehnel, Luke Allen, and Yonghu Wenren. Application of multi-fidelity methods to rotorcraft performance assessment. Engineer Research and Development Center (U.S.), May 2024. http://dx.doi.org/10.21079/11681/48474.

Abstract:
We present a Python-based multi-fidelity tool to estimate rotorcraft performance metrics. We use Gaussian-Process regression (GPR) methods to adaptively build a surrogate model using a small number of high-fidelity CFD points to improve estimates of performance metrics from a medium-fidelity comprehensive analysis model. To include GPR methods in our framework, we used the EmuKit Python package. Our framework adaptively chooses new high-fidelity points to run in regions where the model variance is high. These high-fidelity points are used to update the GPR model; convergence is reached when model variance is below a pre-determined level. To efficiently use our framework on large computer clusters, we implemented this in Galaxy Simulation Builder, an analysis tool that is designed to work on large parallel computing environments. The program is modular, and is designed to be agnostic to the number and names of dependent variables and to the number and identifying labels of the fidelity levels. We demonstrate our multi-fidelity modeling framework on a rotorcraft collective sweep (hover) simulation and compare the accuracy and time savings of the GPR model to that of a simulation run with CFD only.
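
The adaptive loop described here can be sketched with scikit-learn standing in for EmuKit and a cheap function standing in for a CFD run: fit a Gaussian process to the high-fidelity points gathered so far, add the candidate with the largest predictive variance, and stop once that variance falls below a threshold. Multi-fidelity coupling and the rotorcraft models are omitted; the threshold and kernel are assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

expensive = lambda x: np.sin(3 * x) + 0.5 * x     # stand-in for a CFD run
X_hi = np.array([[0.0], [2.0]])                   # initial high-fidelity points
grid = np.linspace(0, 4, 200).reshape(-1, 1)      # candidate design points

for _ in range(8):
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    gp.fit(X_hi, expensive(X_hi).ravel())
    _, std = gp.predict(grid, return_std=True)
    if std.max() < 0.05:                          # converged: variance is low
        break
    X_hi = np.vstack([X_hi, grid[[np.argmax(std)]]])  # sample where we know least
print(f"{len(X_hi)} high-fidelity evaluations used")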
2

Yu, Haichao, Haoxiang Li, Honghui Shi, Thomas S. Huang, and Gang Hua. Any-Precision Deep Neural Networks. Web of Open Science, December 2020. http://dx.doi.org/10.37686/ejai.v1i1.82.

Abstract:
We present Any-Precision Deep Neural Networks (Any-Precision DNNs), which are trained with a new method that empowers learned DNNs to be flexible in any numerical precision during inference. The same model in runtime can be flexibly and directly set to different bit-width, by truncating the least significant bits, to support dynamic speed and accuracy trade-off. When all layers are set to low-bits, we show that the model achieved accuracy comparable to dedicated models trained at the same precision. This nice property facilitates flexible deployment of deep learning models in real-world applications, where in practice trade-offs between model accuracy and runtime efficiency are often sought. Previous literature presents solutions to train models at each individual fixed efficiency/accuracy trade-off point. But how to produce a model flexible in runtime precision is largely unexplored. When the demand of efficiency/accuracy trade-off varies from time to time or even dynamically changes in runtime, it is infeasible to re-train models accordingly, and the storage budget may forbid keeping multiple models. Our proposed framework achieves this flexibility without performance degradation. More importantly, we demonstrate that this achievement is agnostic to model architectures. We experimentally validated our method with different deep network backbones (AlexNet-small, Resnet-20, Resnet-50) on different datasets (SVHN, Cifar-10, ImageNet) and observed consistent results.
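
The truncation trick at the heart of this is easy to demonstrate on a weight vector: quantize once at the highest supported precision, then derive any lower bit-width at run time by dropping least significant bits. The sketch below shows only this mechanism; the paper's actual contribution, training the network so that accuracy holds at every width, is not reproduced.

import numpy as np

def quantize(w, bits, max_bits=8):
    # uniform quantization of weights in [-1, 1] to max_bits levels,
    # then truncation of (max_bits - bits) LSBs for lower precision
    q = np.round((w + 1) / 2 * (2 ** max_bits - 1)).astype(int)
    q >>= max_bits - bits                         # drop least significant bits
    return q / (2 ** bits - 1) * 2 - 1            # map back to [-1, 1]

w = np.random.default_rng(0).uniform(-1, 1, 5)
print("full:", np.round(w, 3))
for b in (8, 4, 2):
    print(f"{b}-bit:", np.round(quantize(w, b), 3))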
