Selected scientific literature on the topic "Bandit algorithm"

Cite a source in APA, MLA, Chicago, Harvard, and many other styles

Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources on the topic "Bandit algorithm".

Next to every source in the reference list there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read the work's abstract online, if it is available in the metadata.

Journal articles on the topic "Bandit algorithm":

1

Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits". Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.

Abstract:
The multi-armed bandit is a reinforcement learning model where a learning agent repeatedly chooses an action (pulls a bandit arm) and the environment responds with a stochastic outcome (a reward) drawn from an unknown distribution associated with the chosen arm. Bandits have a wide range of applications, such as Web recommendation systems. We address the cumulative reward maximization problem in a secure federated learning setting, where multiple data owners keep their data stored locally and collaborate under the coordination of a central orchestration server. We rely on cryptographic schemes and propose Samba, a generic framework for Secure federAted Multi-armed BAndits. Each data owner has data associated with a bandit arm, and the bandit algorithm has to sequentially select which data owner is solicited at each time step. We instantiate Samba for five bandit algorithms. We show that Samba returns the same cumulative reward as the non-secure versions of the bandit algorithms, while satisfying formally proven security properties. We also show that the overhead due to cryptographic primitives is linear in the size of the input, which is confirmed by our proof-of-concept implementation.
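The orchestration pattern described in this abstract (a server choosing which data owner to solicit at each step) is easy to see in code. Below is a minimal, non-secure sketch of that loop using epsilon-greedy selection; it deliberately omits Samba's cryptographic layer, and all names (DataOwner, federated_epsilon_greedy) and the Bernoulli reward model are illustrative assumptions, not the paper's implementation.

```python
import random

class DataOwner:
    """One federated participant; its local data defines one bandit arm.
    The private data is mocked here by a Bernoulli reward probability."""
    def __init__(self, p):
        self.p = p

    def sample_reward(self):
        # In Samba this response would be encrypted; here it is plaintext.
        return 1.0 if random.random() < self.p else 0.0

def federated_epsilon_greedy(owners, horizon, eps=0.1):
    """Orchestration server repeatedly solicits one data owner per step."""
    counts = [0] * len(owners)
    sums = [0.0] * len(owners)
    total = 0.0
    for _ in range(horizon):
        if 0 in counts:
            i = counts.index(0)                 # try every owner once
        elif random.random() < eps:
            i = random.randrange(len(owners))   # explore
        else:
            i = max(range(len(owners)), key=lambda k: sums[k] / counts[k])
        r = owners[i].sample_reward()
        counts[i] += 1
        sums[i] += r
        total += r
    return total

random.seed(0)
print(federated_epsilon_greedy([DataOwner(p) for p in (0.2, 0.5, 0.7)], 10_000))
```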
2

Zhou, Huozhi, Lingda Wang, Lav Varshney, and Ee-Peng Lim. "A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6933–40. http://dx.doi.org/10.1609/aaai.v34i04.6176.

Abstract:
We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which incorporates an efficient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√NKT log T), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of Ω(√NKT), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewise-stationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is Ω(√T). Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of GLR-CUCB compared to other state-of-the-art algorithms.
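For readers unfamiliar with the GLRT component, here is a minimal sketch of a Bernoulli generalized likelihood ratio change detector of the kind GLR-CUCB builds on: it scans every split point of a reward stream and tests whether the two segments look like different Bernoulli distributions. This is a generic illustration, not the paper's code, and the threshold value is an arbitrary assumption.

```python
import math

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), with clipping."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def glr_change_detected(rewards, threshold):
    """Generalized likelihood ratio test: maximize, over all split points s,
    the evidence that rewards[:s] and rewards[s:] have different means."""
    n = len(rewards)
    if n < 2:
        return False
    mean_all = sum(rewards) / n
    for s in range(1, n):
        mean1 = sum(rewards[:s]) / s
        mean2 = sum(rewards[s:]) / (n - s)
        stat = (s * kl_bernoulli(mean1, mean_all)
                + (n - s) * kl_bernoulli(mean2, mean_all))
        if stat > threshold:
            return True
    return False

# A clear mean shift halfway through the stream is flagged:
print(glr_change_detected([0] * 40 + [1] * 40, threshold=math.log(1e4)))
```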
3

Azizi, Javad, Branislav Kveton, Mohammad Ghavamzadeh, and Sumeet Katariya. "Meta-Learning for Simple Regret Minimization". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6709–17. http://dx.doi.org/10.1609/aaai.v37i6.25823.

Abstract:
We develop a meta-learning framework for simple regret minimization in bandits. In this framework, a learning agent interacts with a sequence of bandit tasks, which are sampled i.i.d. from an unknown prior distribution, and learns its meta-parameters to perform better on future tasks. We propose the first Bayesian and frequentist meta-learning algorithms for this setting. The Bayesian algorithm has access to a prior distribution over the meta-parameters, and its meta simple regret over m bandit tasks with horizon n is merely O(m/√n). On the other hand, the meta simple regret of the frequentist algorithm is O(n√m + m/√n). While its regret is worse, the frequentist algorithm is more general because it does not need a prior distribution over the meta-parameters, and it can be analyzed in more settings. We instantiate our algorithms for several classes of bandit problems. Our algorithms are general, and we complement our theory by evaluating them empirically in several environments.
4

Kuroki, Yuko, Liyuan Xu, Atsushi Miyauchi, Junya Honda, and Masashi Sugiyama. "Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback". Neural Computation 32, no. 9 (September 2020): 1733–73. http://dx.doi.org/10.1162/neco_a_01299.

Abstract:
We study the problem of stochastic multiple-arm identification, where an agent sequentially explores a size-k subset of arms (also known as a super arm) from given n arms and tries to identify the best super arm. Most work so far has considered the semi-bandit setting, where the agent can observe the reward of each pulled arm, or has assumed that each arm can be queried at each round. However, in real-world applications it is costly, or sometimes impossible, to observe the rewards of individual arms. In this study, we tackle the full-bandit setting, where only a noisy observation of the total sum of a super arm is given at each pull. Although our problem can be regarded as an instance of best arm identification in linear bandits, a naive approach based on linear bandits is computationally infeasible since the number of super arms K is exponential. To cope with this problem, we first design a polynomial-time approximation algorithm for a 0-1 quadratic programming problem arising in confidence ellipsoid maximization. Based on our approximation algorithm, we propose a bandit algorithm whose computation time is O(log K), thereby achieving an exponential speedup over linear bandit algorithms. We provide a sample complexity upper bound that is still worst-case optimal. Finally, we conduct experiments on large-scale data sets with more than 10^10 super arms, demonstrating the superiority of our algorithms in terms of both the computation time and the sample complexity.
5

Li, Youxuan. "Improvement of the recommendation system based on the multi-armed bandit algorithm". Applied and Computational Engineering 36, no. 1 (January 22, 2024): 237–41. http://dx.doi.org/10.54254/2755-2721/36/20230453.

Abstract:
To effectively address common problems of recommendation systems, such as the cold start problem and the dynamic data modeling problem, researchers apply the multi-armed bandit (MAB) algorithm, the collaborative filtering (CF) algorithm, and user information feedback to update the recommendation model online and in a timely manner. In other words, the cold start problem of the recommendation system is transformed into a problem of exploration and exploitation. The MAB algorithm is used, user features are introduced as context, and the synergy between users is further considered. In this paper, the author studies the improvement of recommendation systems based on the multi-armed bandit algorithm. The Linear Upper Confidence Bound (LinUCB), Collaborative Filtering Bandits (COFIBA), and Context-Aware clustering of Bandits (CAB) algorithms are analyzed. It is found that the MAB algorithm can achieve a good maximum total revenue, regardless of the content value, after going through the cold start stage. In the case of a particularly large amount of content, the CAB algorithm achieves the greatest effect.
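Since the abstract centers on LinUCB, a short sketch may help fix ideas. This is a minimal disjoint-model LinUCB in the style of the standard formulation (a per-arm ridge regression plus an exploration bonus); the class name, dimensions, and the synthetic reward model below are illustrative assumptions, not the paper's code.

```python
import numpy as np

class LinUCB:
    """Minimal disjoint-model LinUCB (one ridge regression per arm):
    score(a) = theta_a . x + alpha * sqrt(x' A_a^{-1} x)."""
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # X'X + I per arm
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # X'y per arm

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Toy run on a synthetic linear-reward environment:
rng = np.random.default_rng(0)
bandit = LinUCB(n_arms=3, dim=5)
true_theta = rng.normal(size=(3, 5))
for _ in range(2000):
    x = rng.normal(size=5)
    a = bandit.select(x)
    bandit.update(a, x, true_theta[a] @ x + rng.normal(scale=0.1))
```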
6

Liu, Zizhuo. "Investigation of progress and application related to Multi-Armed Bandit algorithms". Applied and Computational Engineering 37, no. 1 (January 22, 2024): 155–59. http://dx.doi.org/10.54254/2755-2721/37/20230496.

Abstract:
This paper discusses four Multi-Armed Bandit algorithms: Explore-then-Commit (ETC), Epsilon-Greedy, Upper Confidence Bound (UCB), and Thompson Sampling. The ETC algorithm aims to spend the majority of rounds on the best arm, but it can lead to a suboptimal outcome if the environment changes rapidly. The Epsilon-Greedy algorithm is designed to explore and exploit simultaneously; it often tries sub-optimal arms even after it has found the best arm, and thus performs well when the environment continuously changes. The UCB algorithm is one of the most widely used Multi-Armed Bandit algorithms because it can rapidly narrow down the potentially optimal decisions in a wide range of scenarios; however, it can be affected by specific patterns in the reward distribution or by noise present in the environment. Thompson Sampling is also one of the most common Multi-Armed Bandit algorithms, owing to its simplicity, effectiveness, and adaptability to various reward distributions. It performs well in multiple scenarios because it explores and exploits simultaneously, but its variance is greater than that of the three algorithms mentioned above. Today, Multi-Armed Bandit algorithms are widely used in advertising, health care, and website and app optimization. Multi-Armed Bandit algorithms are rapidly replacing traditional algorithms, and in the future the more advanced contextual Multi-Armed Bandit algorithms will gradually replace the older ones.
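To make the comparison concrete, here are minimal sketches of two of the four algorithms discussed, UCB1 and Bernoulli Thompson Sampling. These follow the standard textbook formulations rather than any code from the paper, and the reward probabilities in the demo are arbitrary.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """UCB1: play each arm once, then maximize mean + sqrt(2 ln t / n_i)."""
    counts, sums = [0] * n_arms, [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            i = t - 1
        else:
            i = max(range(n_arms),
                    key=lambda k: sums[k] / counts[k]
                    + math.sqrt(2 * math.log(t) / counts[k]))
        sums[i] += pull(i)
        counts[i] += 1
    return counts

def thompson_bernoulli(pull, n_arms, horizon):
    """Thompson Sampling for Bernoulli rewards with Beta(1, 1) priors:
    sample a mean for each arm from its posterior, play the argmax."""
    wins, losses = [0] * n_arms, [0] * n_arms
    for _ in range(horizon):
        i = max(range(n_arms),
                key=lambda k: random.betavariate(wins[k] + 1, losses[k] + 1))
        if pull(i) > 0.5:
            wins[i] += 1
        else:
            losses[i] += 1
    return wins, losses

random.seed(1)
probs = [0.3, 0.5, 0.6]
pull = lambda i: 1.0 if random.random() < probs[i] else 0.0
print(ucb1(pull, 3, 5000))               # pull counts concentrate on arm 2
print(thompson_bernoulli(pull, 3, 5000))
```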
7

Agarwal, Mridul, Vaneet Aggarwal, Abhishek Kumar Umrawal, and Chris Quinn. "DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6557–65. http://dx.doi.org/10.1609/aaai.v35i8.16812.

Abstract:
We consider the bandit problem of selecting K out of N arms at each time step. The joint reward can be a non-linear function of the rewards of the selected individual arms. The direct use of a multi-armed bandit algorithm requires choosing among all possible combinations, making the action space large. To simplify the problem, existing works on combinatorial bandits typically assume feedback as a linear function of individual rewards. In this paper, we prove the lower bound for top-K subset selection with bandit feedback with possibly correlated rewards. We present a novel algorithm for the combinatorial setting without using individual arm feedback or requiring linearity of the reward function. Additionally, our algorithm works on correlated rewards of individual arms. Our algorithm, aDaptive Accept RejecT (DART), sequentially finds good arms and eliminates bad arms based on confidence bounds. DART is computationally efficient and uses storage linear in N. Further, DART achieves a regret bound of Õ(K√KNT) for a time horizon T, which matches the lower bound in bandit feedback up to a factor of √log 2NT. When applied to the problem of cross-selling optimization and maximizing the mean of individual rewards, the performance of the proposed algorithm surpasses that of state-of-the-art algorithms. We also show that DART significantly outperforms existing methods for both linear and non-linear joint reward environments.
8

Xue, Bo, Ji Cheng, Fei Liu, Yimu Wang, and Qingfu Zhang. "Multiobjective Lipschitz Bandits under Lexicographic Ordering". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16238–46. http://dx.doi.org/10.1609/aaai.v38i15.29558.

Abstract:
This paper studies the multiobjective bandit problem under lexicographic ordering, wherein the learner aims to simultaneously maximize m objectives hierarchically. The only existing algorithm for this problem considers the multi-armed bandit model, and its regret bound is O((KT)^(2/3)) under a metric called priority-based regret. However, this bound is suboptimal, as the lower bound for single-objective multi-armed bandits is Ω(K log T). Moreover, this bound becomes vacuous when the arm number K is infinite. To address these limitations, we investigate the multiobjective Lipschitz bandit model, which allows for an infinite arm set. Utilizing a newly designed multi-stage decision-making strategy, we develop an improved algorithm that achieves a general regret bound of O(T^((d_z^i+1)/(d_z^i+2))) for the i-th objective, where d_z^i is the zooming dimension for the i-th objective, with i in {1, 2, ..., m}. This bound matches the lower bound of the single-objective Lipschitz bandit problem in terms of T, indicating that our algorithm is almost optimal. Numerical experiments confirm the effectiveness of our algorithm.
9

Sharaf, Amr, and Hal Daumé III. "Meta-Learning Effective Exploration Strategies for Contextual Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9541–48. http://dx.doi.org/10.1609/aaai.v35i11.17149.

Abstract:
In contextual bandits, an algorithm must choose actions given observed contexts, learning from a reward signal that is observed only for the action chosen. This leads to an exploration/exploitation trade-off: the algorithm must balance taking actions it already believes are good with taking new actions to potentially discover better choices. We develop a meta-learning algorithm, Mêlée, that learns an exploration policy based on simulated, synthetic contextual bandit tasks. Mêlée uses imitation learning against these simulations to train an exploration policy that can be applied to true contextual bandit tasks at test time. We evaluate Mêlée on both a natural contextual bandit problem derived from a learning-to-rank dataset as well as hundreds of simulated contextual bandit problems derived from classification tasks. Mêlée outperforms seven strong baselines on most of these datasets by leveraging a rich feature representation for learning an exploration strategy.
10

Nobari, Sadegh. "DBA: Dynamic Multi-Armed Bandit Algorithm". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9869–70. http://dx.doi.org/10.1609/aaai.v33i01.33019869.

Abstract:
We introduce the Dynamic Bandit Algorithm (DBA), a practical solution to a shortcoming of the pervasively employed reinforcement learning algorithm called the Multi-Armed Bandit, a.k.a. Bandit. Bandit makes real-time decisions based on prior observations. However, Bandit is so heavily biased toward those priors that it cannot quickly adapt to a changing trend. As a result, Bandit cannot make profitable decisions quickly enough when the trend is changing. Unlike Bandit, DBA focuses on adapting quickly in order to detect these trends early enough. Furthermore, DBA remains almost as light as Bandit in terms of computation, so it can easily be deployed in production as a lightweight process, just like Bandit. We demonstrate how critical and beneficial the main focus of DBA, i.e., the ability to quickly find the most profitable option in real time, is compared to its state-of-the-art competitors. Our experiments are augmented with a visualization mechanism that explains, through animations, the profitability of the decisions made by each algorithm at each step. Finally, we observe that DBA can outperform the original Bandit by close to 3 times on a given Key Performance Indicator (KPI) in a setting with 3 arms.
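The paper's DBA is not specified in this abstract, so as a stand-in illustration of the general idea (discounting old observations so the index can track a changing trend rather than freezing on early priors), here is a sketch of a generic discounted-UCB-style rule. It is not Nobari's algorithm; the discount factor, bonus constant, and reward model are assumptions.

```python
import math
import random

def discounted_ucb(pull, n_arms, horizon, gamma=0.99, c=1.0):
    """Discounted-UCB-style rule: pull counts and reward sums decay
    geometrically, so the index follows a drifting best arm."""
    counts = [0.0] * n_arms   # discounted pull counts
    sums = [0.0] * n_arms     # discounted reward sums
    picks = [0] * n_arms      # raw pick counts, for inspection
    for t in range(horizon):
        if t < n_arms:
            i = t             # try every arm once
        else:
            n_t = sum(counts)
            i = max(range(n_arms),
                    key=lambda k: sums[k] / counts[k]
                    + c * math.sqrt(math.log(n_t) / counts[k]))
        r = pull(i, t)
        counts = [gamma * v for v in counts]
        sums = [gamma * v for v in sums]
        counts[i] += 1.0
        sums[i] += r
        picks[i] += 1
    return picks

random.seed(2)
# The best arm flips halfway through; the discounted index follows the flip.
def pull(i, t):
    p = (0.8, 0.2) if t < 5000 else (0.2, 0.8)
    return 1.0 if random.random() < p[i] else 0.0

print(discounted_ucb(pull, 2, 10_000))
```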

Theses on the topic "Bandit algorithm":

1

Saadane, Sofiane. "Algorithmes stochastiques pour l'apprentissage, l'optimisation et l'approximation du régime stationnaire". Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30203/document.

Abstract:
In this thesis, we study several topics around stochastic algorithms, and we therefore begin the manuscript with general background on such algorithms, giving historical results to set the framework of our work. We then study a bandit algorithm stemming from the work of Narendra and Shapiro, whose objective is to determine, among several sources, which one is the most profitable for the user, while avoiding spending too much time testing the less efficient ones. Our goal is first to understand the structural weaknesses of this algorithm, and then to propose an optimal procedure for the quantity that measures the performance of a bandit algorithm, the regret. In our results, we propose an algorithm called over-penalized NS, which attains a minimax-optimal regret bound through a fine study of the stochastic algorithm underlying the procedure. A second piece of work gives rates of convergence for the process appearing in the study of the convergence in law of the over-penalized NS algorithm. The particularity of this algorithm is that it does not converge in law toward a diffusion, as most stochastic algorithms do, but toward a non-diffusive jump process, which makes the study of convergence to equilibrium more technical; we use a coupling technique to study this convergence. The second part of this thesis concerns the optimization of a function by means of a stochastic algorithm. We study a stochastic version of the deterministic heavy ball method with friction. The particularity of this algorithm is that its dynamics relies on an averaging over the entire past of its trajectory. The procedure involves a so-called memory function which, depending on the form it takes, yields interesting behaviors; in our study, two types of memory turn out to be relevant: exponential and polynomial. We first establish convergence results in the general case where the function to minimize is non-convex. In the case of strongly convex functions, we obtain convergence rates that are optimal in a sense we define. Finally, the study ends with a convergence in law result for the suitably renormalized process. The third part revolves around the McKean-Vlasov equations, which were introduced by Anatoly Vlasov and first studied by Henry McKean with a view to modeling the distribution law of plasma. Our objective is to propose a stochastic algorithm capable of approximating the invariant measure of the process. Methods for approximating an invariant measure are known for diffusions and certain other processes, but the particularity of the McKean-Vlasov process is that it is not a linear diffusion; indeed, the process has memory, like the heavy ball process. We therefore develop an alternative method, introducing the notion of asymptotic pseudo-trajectories in order to propose an efficient procedure.
2

Zhong, Hongliang. "Bandit feedback in Classification and Multi-objective Optimization". Thesis, Ecole centrale de Marseille, 2016. http://www.theses.fr/2016ECDM0004/document.

Abstract:
Bandit problems constitute a sequential dynamic allocation problem. The pulling agent has to explore its environment (i.e., the arms) to gather information on the one hand, and it has to exploit the collected clues to increase its rewards on the other hand. How to adequately balance the exploration phase and the exploitation phase is the crux of bandit problems, and most of the efforts devoted by the research community in this field have focused on finding the right exploration/exploitation trade-off. In this dissertation, we focus on investigating two specific bandit problems: contextual bandit problems and multi-objective bandit problems. This dissertation provides two contributions. The first contribution concerns classification under partial supervision, which we encode as a contextual bandit problem with side information. This kind of problem is heavily studied by researchers working on social networks and recommendation systems. We provide a series of algorithms, belonging to the Passive-Aggressive family, to solve the bandit feedback problem. We take advantage of their grounded foundations and show that our algorithms are much simpler to implement than state-of-the-art algorithms for bandits with partial feedback, yet achieve better classification performance. For the multi-objective multi-armed bandit problem (MOMAB), we propose an effective and theoretically motivated method to identify the Pareto front of the arms. In particular, we show that we can find all elements of the Pareto front with a minimal budget, within a PAC-bound framework.
3

Faury, Louis. "Variance-sensitive confidence intervals for parametric and offline bandits". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT046.

Abstract:
In this dissertation, we present recent contributions to the problem of optimization under bandit feedback through the design of variance-sensitive confidence intervals. We tackle two distinct topics: (1) the regret minimization task in Generalized Linear Bandits (GLBs), a broad class of non-linear parametric bandits, and (2) the problem of offline policy optimization under bandit feedback. For (1), we study the effects of non-linearity in GLBs and challenge the current understanding that a high level of non-linearity is detrimental to the exploration-exploitation trade-off. We introduce improved algorithms, as well as a novel analysis, proving that when correctly handled, the regret minimization task in GLBs is not necessarily harder than for their linear counterparts; it can even be easier for some important members of the GLB family, such as the Logistic Bandit. Our approach leverages a new confidence set which captures the non-linearity of the reward signal through its variance, along with a local treatment of the non-linearity through a so-called self-concordance analysis. For (2), we leverage results from the distributionally robust optimization framework to construct asymptotic variance-sensitive confidence intervals for the counterfactual evaluation of policies. This makes it possible to ensure conservatism (sought by risk-averse agents) while searching offline for promising policies. Our confidence intervals lead to new counterfactual objectives which, contrary to their predecessors, are better suited for practical deployment thanks to their convex and composite nature.
4

Dorard, L. R. M. "Bandit algorithms for searching large spaces". Thesis, University College London (University of London), 2012. http://discovery.ucl.ac.uk/1348319/.

Abstract:
Bandit games consist of single-state environments in which an agent must sequentially choose actions to take, for which rewards are given. The objective being to maximise the cumulated reward, the agent naturally seeks to build a model of the relationship between actions and rewards. The agent must both choose uncertain actions in order to improve its model (exploration), and actions that are believed to yield high rewards according to the model (exploitation). The choice of an action to take is called a play of an arm of the bandit, and the total number of plays may or may not be known in advance. Algorithms designed to handle the exploration-exploitation dilemma were initially motivated by problems with rather small numbers of actions. But the ideas they were based on have been extended to cases where the number of actions to choose from is much larger than the maximum possible number of plays. Several problems fall into this setting, such as information retrieval with relevance feedback, where the system must learn what a user is looking for while serving relevant documents often enough, but also global optimisation, where the search for an optimum is done by selecting where to acquire potentially expensive samples of a target function. All have in common the search of large spaces. In this thesis, we focus on an algorithm based on the Gaussian Processes probabilistic model, often used in Bayesian optimisation, and the Upper Confidence Bound action-selection heuristic that is popular in bandit algorithms. In addition to demonstrating the advantages of the GP-UCB algorithm on an image retrieval problem, we show how it can be adapted in order to search tree-structured spaces. We provide an efficient implementation, theoretical guarantees on the algorithm's performance, and empirical evidence that it handles large branching factors better than previous bandit-based algorithms, on synthetic trees.
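A compact sketch of the GP-UCB loop at the heart of this thesis: fit a Gaussian Process posterior to the samples gathered so far, then query the candidate point maximizing posterior mean plus a confidence-width term. The RBF kernel, candidate grid, beta, and noise level below are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel matrix between 1-D point sets a and b."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_ucb(f, candidates, rounds, beta=2.0, noise=1e-3):
    """GP-UCB on a 1-D grid: at each round, fit the GP posterior to the
    observations so far and sample where mean + sqrt(beta) * std is largest."""
    X, y = [], []
    for _ in range(rounds):
        if not X:
            x_next = candidates[len(candidates) // 2]  # arbitrary first query
        else:
            Xa, ya = np.array(X), np.array(y)
            K = rbf(Xa, Xa) + noise * np.eye(len(Xa))
            K_inv = np.linalg.inv(K)
            k_star = rbf(np.asarray(candidates), Xa)   # (n_cand, n_obs)
            mu = k_star @ K_inv @ ya
            var = 1.0 - np.sum((k_star @ K_inv) * k_star, axis=1)
            ucb = mu + np.sqrt(beta * np.maximum(var, 0.0))
            x_next = candidates[int(np.argmax(ucb))]
        X.append(x_next)
        y.append(f(x_next))
    return max(zip(y, X))  # best observed (value, point)

rng = np.random.default_rng(3)
f = lambda x: -(x - 0.3) ** 2 + 0.01 * rng.normal()   # noisy target
best_y, best_x = gp_ucb(f, np.linspace(0, 1, 101), rounds=30)
print(best_x, best_y)  # should approach the optimum near x = 0.3
```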
5

Jedor, Matthieu. "Bandit algorithms for recommender system optimization". Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM027.

Abstract:
In this PhD thesis, we study the optimization of recommender systems, with the objective of providing users with more refined item suggestions. The task is modeled using the multi-armed bandit framework. In a first part, we look at two problems that commonly occur in recommender systems: the large number of items to handle and the management of sponsored content. In a second part, we investigate the empirical performance of bandit algorithms, and in particular how to tune conventional algorithms to improve results in the stationary and non-stationary environments that arise in practice. This leads us to analyze, both theoretically and empirically, the greedy algorithm, which in some cases outperforms the state of the art.
6

Besson, Lilian. "Multi-Players Bandit Algorithms for Internet of Things Networks". Thesis, CentraleSupélec, 2019. http://www.theses.fr/2019CSUP0005.

Abstract:
In this PhD thesis, we study wireless networks and reconfigurable end-devices that can access Cognitive Radio networks, in unlicensed bands and without central control. We focus on Internet of Things (IoT) networks, with the objective of extending the devices' battery life by equipping them with low-cost but efficient machine learning algorithms, in order to let them automatically improve the efficiency of their wireless communications. We propose different models of IoT networks, and we show empirically, in both numerical simulations and a realistic real-world validation, the possible gain of our methods, which use Reinforcement Learning. The different network access problems are modeled as Multi-Armed Bandits (MAB), but analyzing the realistic models proved intractable, because proving the convergence of many IoT devices playing a collaborative game without communication or coordination is hard when they all follow random activation patterns. The rest of this manuscript thus studies two restricted models: first, multi-player bandits in stationary problems, then non-stationary single-player bandits. We also detail another contribution, SMPyBandits, our open-source Python library for numerical MAB simulations, which covers all the studied models and more.
7

Deffayet, Romain. "Bandit Algorithms for Adaptive Modulation and Coding in Wireless Networks". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281884.

Abstract:
The demand for quality cellular network coverage has been increasing significantly in recent years and will continue its progression in the near future. This results from an increase in transmitted data, owing to new use cases (HD videos, live streaming, online games, ...), but also from a diversification of the traffic, notably because of shorter and more frequent transmissions, which can be due to IoT devices or other telemetry applications. Cellular networks are becoming increasingly complex, and the need for better management of the network's properties is higher than ever. The combined effect of these two paradigms creates a trade-off: whereas one would like to design algorithms that achieve high-performance decision-making, one would also like them to do so in any setting that can be encountered in this complex network. Instead, this thesis proposes to restrict the scope of the decision-making algorithms through online learning. It focuses on initial MCS selection in Adaptive Modulation and Coding, in which one must choose an initial transmission rate that guarantees fast communications and a low error rate. We formulate the problem as a Reinforcement Learning problem and propose relevant restrictions to simpler frameworks such as Multi-Armed Bandits and Contextual Bandits. Eight bandit algorithms are tested and reviewed with emphasis on practical applications. The thesis shows that a Reinforcement Learning agent can improve the utilization of the link capacity between the transmitter and the receiver. First, we present a cell-wide Multi-Armed Bandit agent, which learns the optimal initial offset in a given cell, and then a contextual augmentation of this agent that takes user-specific features as input. With bursty traffic, the proposed method achieves an 8% increase in the median throughput and a 65% reduction in the median regret over the first 0.5 s of transmission, when compared to a fixed baseline.
8

Degenne, Rémy. "Impact of structure on the design and analysis of bandit algorithms". Thesis, Université de Paris (2019-....), 2019. http://www.theses.fr/2019UNIP7179.

Abstract:
In this thesis, we study sequential learning problems called stochastic multi-armed bandits. First, a new bandit algorithm is presented. The analysis of that algorithm uses confidence intervals on the means of the arms' reward distributions, as most bandit proofs do. In a parametric setting, we derive concentration inequalities which quantify the deviation between the mean parameter of a distribution and its empirical estimate, in order to obtain confidence intervals. These inequalities are expressed as bounds on the Kullback-Leibler divergence. Three extensions of the stochastic multi-armed bandit problem are then studied. First, we study the so-called combinatorial semi-bandit problem, in which an algorithm chooses a set of arms and the reward of each of those arms is observed. The minimal attainable regret then depends on the correlation between the arm distributions. We then consider a setting in which the observation mechanism changes. One source of difficulty of the bandit problem is the scarcity of information: only the pulled arm is observed. We show how to make efficient use of supplementary free observations, when available (they do not enter the regret). Finally, a new family of algorithms is introduced to obtain both regret minimization and best arm identification guarantees; each algorithm of the family realizes a trade-off between regret and the time needed to identify the best arm. In a second part, we study the so-called pure exploration problem, in which an algorithm is evaluated not on its regret but on the probability that it returns a wrong answer to a question about the arm distributions. We determine the complexity of such problems and design algorithms whose performance approaches that complexity.
9

Nicol, Olivier. "Data-driven evaluation of contextual bandit algorithms and applications to dynamic recommendation". Thesis, Lille 1, 2014. http://www.theses.fr/2014LIL10211/document.

Abstract:
The context of this thesis work is dynamic recommendation. Recommendation is the action, for an intelligent system, of supplying the user of an application with personalized content so as to enhance what is referred to as the "user experience", e.g., recommending a product on a merchant website or an article on a blog. Recommendation is considered dynamic when the content to recommend or the users' tastes evolve rapidly, e.g., news recommendation. Many applications that are of interest to us generate a tremendous amount of data through the millions of online users they have. Nevertheless, using this data to evaluate a new recommendation technique, or even to compare two dynamic recommendation algorithms, is far from trivial. This is the problem we consider here. Some approaches have already been proposed, but they were not studied very thoroughly, either from a theoretical point of view (unquantified bias, loose convergence bounds, ...) or from an empirical one (experiments on private data only). In this work, we start by filling many gaps in the theoretical analysis. Then we comment on the results of an experiment of unprecedented scale in this area: a public challenge we organized. This challenge, along with some complementary experiments, revealed an unexpected source of huge bias: time acceleration. The rest of this work tackles this issue. We show that a bootstrap-based approach makes it possible to significantly reduce this bias and, more importantly, to control it.
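The data-driven evaluation discussed here is commonly done with the replay estimator, and the thesis's remedy is bootstrap-based; the sketch below shows both in simplified form. replay_evaluate assumes the logging policy chose actions uniformly at random, bootstrap_ci is only a schematic stand-in for the thesis's bootstrap approach, and all names and the toy log are assumptions.

```python
import random

def replay_evaluate(policy, logs):
    """Offline replay estimator: keep only logged events where the candidate
    policy picks the same action that was logged, and average their rewards."""
    total, matched = 0.0, 0
    for context, logged_action, reward in logs:
        if policy(context) == logged_action:
            total += reward
            matched += 1
    return total / matched if matched else float("nan")

def bootstrap_ci(policy, logs, n_boot=200, seed=0):
    """Percentile bootstrap over the log: resample events with replacement,
    re-run the replay estimator, and report an empirical 95% interval."""
    rng = random.Random(seed)
    estimates = sorted(
        replay_evaluate(policy, [rng.choice(logs) for _ in range(len(logs))])
        for _ in range(n_boot)
    )
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot)]

# Toy log: binary contexts, two actions, logged uniformly at random.
rng = random.Random(4)
logs = []
for _ in range(5000):
    ctx = rng.randint(0, 1)
    act = rng.randint(0, 1)
    reward = 1.0 if rng.random() < (0.8 if act == ctx else 0.2) else 0.0
    logs.append((ctx, act, reward))
print(replay_evaluate(lambda c: c, logs))  # near 0.8 for the matched policy
print(bootstrap_ci(lambda c: c, logs))
```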
10

Claeys, Emmanuelle. "Clusterisation incrémentale, multicritères de données hétérogènes pour la personnalisation d’expérience utilisateur". Thesis, Strasbourg, 2019. http://www.theses.fr/2019STRAD039.

Abstract:
In many activity sectors (health, online sales, ...), designing from scratch an optimal solution to a given problem (finding a protocol that increases the cure rate, designing a web page that promotes the purchase of one or more products, ...) is often very difficult or even impossible. To face this difficulty, designers (doctors, web designers, production engineers, ...) often work incrementally, by successive improvements of an existing solution. However, defining the most relevant changes remains a difficult problem. Therefore, a solution adopted more and more frequently is to concretely compare different alternatives (also called variations) in order to determine the best one via an A/B test. The idea is to actually deploy these alternatives and compare the results obtained, i.e., the respective rewards obtained by each variation. To identify the optimal variation in the shortest possible time, many testing methods use an automated dynamic allocation strategy, which quickly and automatically allocates the tested subjects to the most efficient variation, through reinforcement learning algorithms (such as multi-armed bandit methods). These methods have shown their interest in practice, but also their limitations, including in particular too long a latency (i.e., the delay between the arrival of a subject to be tested and its allocation), a lack of explainability of the choices, and the failure to integrate an evolving context describing the subject's behavior before being tested. The overall objective of this thesis is to propose a generic A/B testing method allowing dynamic real-time allocation, capable of taking into account the subjects' characteristics, whether temporal or not, and interpretable a posteriori.

Books on the topic "Bandit algorithm":

1

Braun, Kathrin, and Cordula Kropp, eds. In digitaler Gesellschaft. Bielefeld, Germany: transcript Verlag, 2021. http://dx.doi.org/10.14361/9783839454534.

Abstract:
How do societal practices, and the prospects for the democratic shaping of technology, change when robots, algorithms, simulations, and self-learning systems are included alongside citizens and the public, and are taken seriously as participants? The contributors to this volume examine the reconfiguration of responsibility and control, knowledge, claims to participation, and possibilities for cooperation in dealing with intelligent systems such as smart grids, service robots, route planners, financial-market algorithms, and other socio-digital arrangements. They show how the digital "newcomers" help change the scope for shaping democracy, inclusion, and sustainability, and shift existing relations of power and force.
2

Block, Katharina, Anne Deremetz, Anna Henkel, and Malte Rehbein, eds. 10 Minuten Soziologie: Digitalisierung. Bielefeld, Germany: transcript Verlag, 2022. http://dx.doi.org/10.14361/9783839457108.

Abstract:
From algorithms to sensors, digitalization encompasses a wide variety of technological innovations. Equally multifaceted are the dimensions in which it transforms society while, at the same time, being shaped by it. The effects on communication in public space, on science and agriculture, as well as the interplay with law, the economy, and ecology: the contributors to this volume pursue these and other aspects of digitalization from various theoretical perspectives. In doing so, they open up perspectives that allow digitalization to be understood and explained as socio-technical change.
3

Szepesvári, Csaba, and Tor Lattimore. Bandit Algorithms. Cambridge University Press, 2020.

4

Lattimore, Tor, and Csaba Szepesvári. Bandit Algorithms. Cambridge University Press, 2020.

5

White, John Myles. Bandit Algorithms for Website Optimization. O'Reilly Media, Incorporated, 2012.

6

Głowacka, Dorota. Bandit Algorithms in Information Retrieval. Now Publishers, 2019.

7

Bandit Algorithms for Website Optimization: Developing, Deploying, and Debugging. O'Reilly Media, 2012.

8

Verständig, Dan, Christina Kast, Janne Stricker, and Andreas Nürnberger, eds. Algorithmen und Autonomie. Verlag Barbara Budrich, 2022. http://dx.doi.org/10.3224/84742520.

Abstract:
We live in a world of algorithmic sorting and decision-making. Mathematical models curate our social relationships, influence our elections, and even decide whether or not we should go to prison. But how much do we really know about code, algorithmic structures, and how they work? This volume addresses questions of autonomy in the digital age from an interdisciplinary perspective, combining contributions from philosophy, educational science, and cultural studies with computer science.
9

Beyer, Elena, Katharina Erler, Christoph Hartmann, Malte Kramme, Michael F. Müller, Tereza Pertot, Elif Tuna, and Felix M. Wilke, eds. Privatrecht 2050 - Blick in die digitale Zukunft. Nomos Verlagsgesellschaft mbH & Co. KG, 2020. http://dx.doi.org/10.5771/9783748901723.

Abstract:
This volume contains the contributions to the 30th annual conference of the Gesellschaft Junge Zivilrechtswissenschaft e.V., hosted in September 2019 by early-career researchers in Bayreuth. The contributions deal with the effects of digital change on the development of private law in the coming decades and cover the following fields: legal tech (limits of the personalization of default rules, possibilities for formalizing the law and automating subsumption, legal services provided by online debt-collection services); contract law (the contractual treatment of the platform economy, the involvement of artificial intelligence in contractual relationships, and liability questions in the Internet of Things); property law (the transfer of bitcoins); company law (the EU Company Law Package and the digitalized GmbH, the virtual general meeting); procedural law (digital evidence and smart enforcement); discrimination by algorithms; and data protection violations as competition-law infringements. With contributions by Martin Schmidt-Kessel, Philip Maximilian Bender, Johannes Klug, Sören Segger-Piening, Johannes Warter, Julia Grinzinger, Dimitrios Linardatos, Lena Maute, Miriam Kullmann, Ralf Knaier, Patrick Nutz, Miriam Buiten, Julia Harten, David Markworth, Lukas Klever, and Julian Rapp.

Book chapters on the topic "Bandit algorithm":

1

Liu, Weiwen, Shuai Li, and Shengyu Zhang. "Contextual Dependent Click Bandit Algorithm for Web Recommendation". In Lecture Notes in Computer Science, 39–50. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-94776-1_4.

2

Neu, Gergely, and Gábor Bartók. "An Efficient Algorithm for Learning with Semi-bandit Feedback". In Lecture Notes in Computer Science, 234–48. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-40935-6_17.

3

Gagliolo, Matteo, and Jürgen Schmidhuber. "Algorithm Selection as a Bandit Problem with Unbounded Losses". In Lecture Notes in Computer Science, 82–96. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-13800-3_7.

4

You, Shuhua, Quan Liu, Qiming Fu, Shan Zhong, and Fei Zhu. "A Bayesian Sarsa Learning Algorithm with Bandit-Based Method". In Neural Information Processing, 108–16. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-26532-2_13.

5

El Mesaoudi-Paul, Adil, Dimitri Weiß, Viktor Bengs, Eyke Hüllermeier, and Kevin Tierney. "Pool-Based Realtime Algorithm Configuration: A Preselection Bandit Approach". In Lecture Notes in Computer Science, 216–32. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-53552-0_22.

6

Larcher, Maxime, Robert Meier, and Angelika Steger. "A Simple Optimal Algorithm for the 2-Arm Bandit Problem". In Symposium on Simplicity in Algorithms (SOSA), 365–72. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2023. http://dx.doi.org/10.1137/1.9781611977585.ch33.

7

Bouneffouf, Djallel, Amel Bouzeghoub, and Alda Lopes Gançarski. "A Contextual-Bandit Algorithm for Mobile Context-Aware Recommender System". In Neural Information Processing, 324–31. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-34487-9_40.

8

Achab, Mastane, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, and Claire Vernade. "Max K-Armed Bandit: On the ExtremeHunter Algorithm and Beyond". In Machine Learning and Knowledge Discovery in Databases, 389–404. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-71246-8_24.

9

Moeini, Mahdi, Oliver Wendt, and Linus Krumrey. "Portfolio Optimization by Means of a χ-Armed Bandit Algorithm". In Intelligent Information and Database Systems, 620–29. Berlin, Heidelberg: Springer Berlin Heidelberg, 2016. http://dx.doi.org/10.1007/978-3-662-49390-8_60.

10

Zhang, Xiaofang, Qian Zhou, Tieke He, and Bin Liang. "Con-CNAME: A Contextual Multi-armed Bandit Algorithm for Personalized Recommendations". In Artificial Neural Networks and Machine Learning – ICANN 2018, 326–36. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-01421-6_32.


Conference papers on the topic "Bandit algorithm":

1

Bouneffouf, Djallel, Irina Rish, Guillermo Cecchi, and Raphaël Féraud. "Context Attentive Bandits: Contextual Bandit with Restricted Context". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/203.

Abstract:
We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation is motivated by different online problems arising in clinical trials, recommender systems and attention modeling. Herein, we adapt the standard multi-armed bandit algorithm known as Thompson Sampling to take advantage of our restricted context setting, and propose two novel algorithms, called the Thompson Sampling with Restricted Context (TSRC) and the Windows Thompson Sampling with Restricted Context (WTSRC), for handling stationary and nonstationary environments, respectively. Our empirical results demonstrate advantages of the proposed approaches on several real-life datasets.
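
To make the restricted-context mechanics concrete, here is a loose Python sketch in the spirit of TSRC, not the authors' implementation: the binary contexts, the per-feature relevance posteriors, and all parameter choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class RestrictedContextTS:
    """Bernoulli Thompson Sampling when only k of d context features
    may be observed per round (illustrative simplification)."""

    def __init__(self, n_arms, d, k):
        self.k = k
        # Beta posterior on how informative each feature is
        self.fa, self.fb = np.ones(d), np.ones(d)
        # Beta posterior on reward for each (arm, active-feature) pair
        self.aw, self.al = np.ones((n_arms, d)), np.ones((n_arms, d))

    def select(self, x):
        # 1) Thompson-sample which k features to observe this round
        feats = np.argsort(rng.beta(self.fa, self.fb))[-self.k:]
        # 2) Thompson-sample arm rewards using only the visible features
        theta = rng.beta(self.aw[:, feats], self.al[:, feats])
        arm = int(np.argmax((theta * x[feats]).sum(axis=1)))
        return arm, feats

    def update(self, arm, feats, x, r):
        on = feats[x[feats] > 0]        # visible features that were active
        self.aw[arm, on] += r
        self.al[arm, on] += 1 - r
        self.fa[feats] += r             # credit the inspected features
        self.fb[feats] += 1 - r
```

A WTSRC-style variant for the nonstationary case would, as the abstract indicates, additionally window or discount these counts so that old observations fade out.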
2

Gao, Ruijiang, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Min Kyung Lee, and Matthew Lease. "Human-AI Collaboration with Bandit Feedback". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/237.

Abstract:
Human-machine complementarity is important when neither the algorithm nor the human yields dominant performance across all instances in a given domain. Most research on algorithmic decision-making solely centers on the algorithm's performance, while recent work that explores human-machine collaboration has framed the decision-making problems as classification tasks. In this paper, we first propose and then develop a solution for a novel human-machine collaboration problem in a bandit feedback setting. Our solution aims to exploit the human-machine complementarity to maximize decision rewards. We then extend our approach to settings with multiple human decision makers. We demonstrate the effectiveness of our proposed methods using both synthetic and real human responses, and find that our methods outperform both the algorithm and the human when they each make decisions on their own. We also show how personalized routing in the presence of multiple human decision-makers can further improve the human-machine team performance.
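
One simplistic way to see the routing problem (a toy stand-in, not the paper's method): treat each decision-maker, machine or human, as a bandit arm, observe only the reward of whoever was chosen, and route with UCB. The team composition and accuracies below are invented.

```python
import math
import random

random.seed(1)
deciders = ["model", "human_A", "human_B"]                    # hypothetical team
true_acc = {"model": 0.72, "human_A": 0.80, "human_B": 0.65}  # unknown to learner
pulls = {d: 0 for d in deciders}
wins = {d: 0.0 for d in deciders}

def route(t):
    # UCB1: try everyone once, then pick the largest upper confidence bound
    for d in deciders:
        if pulls[d] == 0:
            return d
    return max(deciders,
               key=lambda d: wins[d] / pulls[d]
               + math.sqrt(2 * math.log(t) / pulls[d]))

for t in range(1, 2001):
    d = route(t)
    reward = 1.0 if random.random() < true_acc[d] else 0.0   # bandit feedback only
    pulls[d] += 1
    wins[d] += reward

print({d: round(wins[d] / pulls[d], 3) for d in deciders})
```

Personalized routing, as studied in the paper, would condition this choice on the instance itself rather than learning a single best decider.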
3

Zhang, Xiaoying, Hong Xie, Hang Li, and John C.S. Lui. "Conversational Contextual Bandit: Algorithm and Application". In WWW '20: The Web Conference 2020. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3366423.3380148.

4

Xie, Miao, Wotao Yin, and Huan Xu. "AutoBandit: A Meta Bandit Online Learning System". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/719.

Abstract:
Recently, research on online multi-armed bandits (MAB) has been growing rapidly, as novel problem settings and algorithms motivated by various practical applications are being studied, building on top of the classic bandit problem. However, identifying the best bandit algorithm from many potential candidates for a given application is not only time-consuming but also reliant on human expertise, which hinders the practicality of MAB. To alleviate this problem, this paper outlines an intelligent system called AutoBandit, equipped with many out-of-the-box MAB algorithms, for automatically and adaptively choosing the best one, with suitable hyper-parameters, online. It is effective in helping a growing application continuously maximize the cumulative reward over its whole life-cycle. With a flexible architecture and user-friendly web-based interfaces, it is very convenient for the user to integrate and monitor online bandits in a business system. At the time of publication, AutoBandit had been deployed for various industrial applications.
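
The meta-level selection idea can be illustrated in a few lines (a hedged toy, not the AutoBandit system): run several out-of-the-box base MAB algorithms and let an epsilon-greedy meta-layer allocate each round to whichever base algorithm has earned the most reward so far. The arm means and all constants are invented.

```python
import math
import random

random.seed(0)
ARMS = [0.3, 0.5, 0.7]                     # hidden Bernoulli reward means

def make_state():
    return {"n": [0] * len(ARMS), "s": [0.0] * len(ARMS)}

def ucb1(st, t):
    for a in range(len(ARMS)):             # play every arm once first
        if st["n"][a] == 0:
            return a
    return max(range(len(ARMS)),
               key=lambda a: st["s"][a] / st["n"][a]
               + math.sqrt(2 * math.log(t) / st["n"][a]))

def eps_greedy(st, t, eps=0.1):
    if random.random() < eps or not any(st["n"]):
        return random.randrange(len(ARMS))
    return max(range(len(ARMS)), key=lambda a: st["s"][a] / max(st["n"][a], 1))

base = {"ucb1": (ucb1, make_state()), "eps-greedy": (eps_greedy, make_state())}
meta = {name: {"n": 0, "s": 0.0} for name in base}

for t in range(1, 5001):
    # meta level: epsilon-greedy over base algorithms by average reward
    if random.random() < 0.1 or any(m["n"] == 0 for m in meta.values()):
        name = random.choice(list(base))
    else:
        name = max(meta, key=lambda k: meta[k]["s"] / meta[k]["n"])
    policy, st = base[name]
    a = policy(st, t)
    r = 1.0 if random.random() < ARMS[a] else 0.0
    st["n"][a] += 1; st["s"][a] += r
    meta[name]["n"] += 1; meta[name]["s"] += r

print({k: round(v["s"] / v["n"], 3) for k, v in meta.items()})
```

A production system would also adapt hyper-parameters and detect distribution drift, which is what the abstract claims AutoBandit automates.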
5

Nie, Keyu, Zezhong Zhang, Ted Tao Yuan, Rong Song, and Pauline Berry Burke. "Efficient Multivariate Bandit Algorithm with Path Planning". In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2020. http://dx.doi.org/10.1109/ictai50040.2020.00023.

6

Peng, Yi, Miao Xie, Jiahao Liu, Xuying Meng, Nan Li, Cheng Yang, Tao Yao, and Rong Jin. "A Practical Semi-Parametric Contextual Bandit". In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/450.

Abstract:
Classic multi-armed bandit algorithms are inefficient for a large number of arms. On the other hand, contextual bandit algorithms are more efficient, but they suffer from a large regret due to the bias of reward estimation with finite-dimensional features. Although recent studies proposed semi-parametric bandits to overcome these defects, they assume arms' features are constant over time. However, this assumption rarely holds in practice, since real-world problems often involve underlying processes that are dynamically evolving over time, especially for special promotions like Singles' Day sales. In this paper, we formulate a novel Semi-Parametric Contextual Bandit Problem to relax this assumption. For this problem, a novel Two-Steps Upper-Confidence Bound framework, called Semi-Parametric UCB (SPUCB), is presented. It can be flexibly applied to the linear parametric function problem with a gap-free bound on the n-step regret. Moreover, to make the method more practical in online systems, an optimization is proposed for dealing with high-dimensional features of a linear function. Extensive experiments on synthetic data as well as a real dataset from one of the largest e-commerce platforms demonstrate the superior performance of our algorithm.
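
For orientation, the linear-parametric building block referred to here is usually an index of LinUCB type; the following is the standard construction, not necessarily the exact form used in SPUCB:

$$
a_t = \arg\max_{a} \left( x_{t,a}^{\top} \hat{\theta}_t + \alpha \sqrt{x_{t,a}^{\top} A_t^{-1} x_{t,a}} \right),
\qquad
A_t = \lambda I + \sum_{s<t} x_{s,a_s} x_{s,a_s}^{\top},
$$

where $\hat{\theta}_t = A_t^{-1} \sum_{s<t} r_s x_{s,a_s}$ is the ridge estimate. A semi-parametric model adds a non-parametric component to the reward on top of the linear term $x^{\top}\theta$, and the two-step procedure described in the abstract is designed to preserve UCB-style guarantees despite that extra term.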
7

Yang, Peng, Peilin Zhao, and Xin Gao. "Bandit Online Learning on Graphs via Adaptive Optimization". In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/415.

Abstract:
Traditional online learning on graphs adapts the graph Laplacian into ridge regression, which may not guarantee reasonable accuracy when the data are adversarially generated. To solve this issue, we exploit an adaptive optimization framework for online classification on graphs. The derived model can achieve a min-max regret under an adversarial mechanism of data generation. To take advantage of informative labels, we propose an adaptive large-margin update rule, which enjoys a lower regret than algorithms using error-driven update rules. However, this algorithm assumes that the full-information label is provided for each node, which is violated in many practical applications where labeling is expensive and the oracle may only tell whether the prediction is correct or not. To address this issue, we propose a bandit online algorithm on graphs. It derives a per-instance confidence region of the prediction, from which the model can be learned adaptively to minimize the online regret. Experiments on benchmark graph datasets show that the proposed bandit algorithm outperforms state-of-the-art competitors, and sometimes even beats algorithms using full-information label feedback.
8

Ou, Mingdong, Nan Li, Shenghuo Zhu, and Rong Jin. "Multinomial Logit Bandit with Linear Utility Functions". In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/361.

Abstract:
Multinomial logit bandit is a sequential subset selection problem which arises in many applications. In each round, the player selects a K-cardinality subset from N candidate items, and receives a reward which is governed by a multinomial logit (MNL) choice model considering both item utility and the substitution property among items. The player's objective is to dynamically learn the parameters of the MNL model and maximize the cumulative reward over a finite horizon T. This problem faces the exploration-exploitation dilemma, and its combinatorial nature makes it non-trivial. In recent years, several algorithms have been developed that exploit specific characteristics of the MNL model, but all of them estimate the parameters of the MNL model separately and incur a regret bound which is not preferred for a large candidate set size N. In this paper, we consider the linear utility MNL choice model whose item utilities are represented as linear functions of d-dimensional item features, and propose an algorithm, titled LUMB, to exploit the underlying structure. It is proven that the proposed algorithm achieves a regret which is free of the candidate set size. Experiments show the superiority of the proposed algorithm.
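
For reference, the linear-utility MNL model described here has the standard textbook form, with the no-purchase option normalized to utility zero (details such as scaling may differ in the paper):

$$
\Pr(i \mid S) = \frac{e^{u_i}}{1 + \sum_{j \in S} e^{u_j}},
\qquad u_i = x_i^{\top} \theta^{*},
$$

so the expected reward of an offered set $S$ is $\sum_{i \in S} r_i \Pr(i \mid S)$. Because all $N$ items share the single parameter $\theta^{*} \in \mathbb{R}^d$, estimating $\theta^{*}$ directly, as LUMB does, is what allows a regret bound independent of the candidate set size.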
9

Hu, Yi-Qi, Yang Yu, and Jun-Da Liao. "Cascaded Algorithm-Selection and Hyper-Parameter Optimization with Extreme-Region Upper Confidence Bound Bandit". In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/351.

Abstract:
An automatic machine learning (AutoML) task is to select the best algorithm and its hyper-parameters simultaneously. Previously, the hyper-parameters of all algorithms were joined into a single search space, which is not only huge but also redundant, because many dimensions of hyper-parameters are irrelevant to the selected algorithm. In this paper, we propose a cascaded approach for algorithm selection and hyper-parameter optimization. While a search procedure is employed at the level of hyper-parameter optimization, a bandit strategy runs at the level of algorithm selection to allocate the budget based on the search feedback. Since the bandit is required to select the algorithm with the maximum performance, instead of the average performance, we propose the extreme-region upper confidence bound (ER-UCB) strategy, which focuses on the extreme region of the underlying feedback distribution. We show theoretically that ER-UCB has a regret upper bound of O(K ln n) with independent feedbacks, which is as efficient as the classical UCB bandit. We also conduct experiments on a synthetic problem as well as a set of AutoML tasks. The results verify the effectiveness of the proposed method.
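
For comparison, the classical UCB index the abstract benchmarks against selects, at round $n$, the arm (here, the candidate algorithm) maximizing

$$
\mathrm{UCB}_k(n) = \hat{\mu}_k + \sqrt{\frac{2 \ln n}{n_k}},
$$

where $\hat{\mu}_k$ is the empirical mean feedback of arm $k$ and $n_k$ its play count. Per the abstract, ER-UCB keeps this exploration structure but replaces the mean-performance criterion with one focused on the extreme (upper) region of the feedback distribution, since AutoML cares about the best validation score an algorithm can reach rather than its average.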
10

Kiribuchi, Daiki, Myungsook Ko, and Takeichiro Nishikawa. "Maintenance Interval Adjustment by Applying the Bandit Algorithm". In 2019 IEEE International Conference on Industrial Technology (ICIT). IEEE, 2019. http://dx.doi.org/10.1109/icit.2019.8755002.


Reports of organizations on the topic "Bandit algorithm":

1

Marty, Frédéric, and Thierry Warin. Deciphering Algorithmic Collusion: Insights from Bandit Algorithms and Implications for Antitrust Enforcement. CIRANO, December 2023. http://dx.doi.org/10.54932/iwpg7510.

Abstract:
This paper examines algorithmic collusion from legal and economic perspectives, highlighting the growing role of algorithms in digital markets and their potential for anti-competitive behavior. Using bandit algorithms as a model, traditionally applied in uncertain decision-making contexts, we illuminate the dynamics of implicit collusion without overt communication. Legally, the challenge is discerning and classifying these algorithmic signals, especially as unilateral communications. Economically, distinguishing between rational pricing and collusive patterns becomes intricate with algorithm-driven decisions. The paper emphasizes the imperative for competition authorities to identify unusual market behaviors, hinting at shifting the burden of proof to firms with algorithmic pricing. Balancing algorithmic transparency and collusion prevention is crucial. While regulations might address these concerns, they could hinder algorithmic development. As this form of collusion becomes central in antitrust, understanding it through models like bandit algorithms is vital, since the latter may converge faster towards an anticompetitive equilibrium.
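
To see why bandit learners are a natural laboratory for this question, consider a toy duopoly simulation; everything here, the demand function, the price grid, the constants, is invented for illustration and is not the report's model. Two independent epsilon-greedy agents set prices and each observes only its own profit, with no communication.

```python
import random

random.seed(42)
PRICES = [1.0, 1.5, 2.0, 2.5, 3.0]          # hypothetical price grid

def profit(p_own, p_rival):
    # toy linear demand: falls in own price, rises in the rival's price
    demand = max(0.0, 4.0 - 1.5 * p_own + 0.8 * p_rival)
    return p_own * demand

# per-agent bandit statistics: [pulls, cumulative profit] for each price
stats = [{p: [0, 0.0] for p in PRICES} for _ in range(2)]

def choose(i, eps=0.05):
    if random.random() < eps:
        return random.choice(PRICES)
    return max(PRICES, key=lambda p: stats[i][p][1] / max(stats[i][p][0], 1))

for _ in range(200_000):
    p0, p1 = choose(0), choose(1)
    for i, (own, rival) in enumerate([(p0, p1), (p1, p0)]):
        stats[i][own][0] += 1
        stats[i][own][1] += profit(own, rival)

print("greedy prices after learning:", choose(0, eps=0.0), choose(1, eps=0.0))
```

Whether such independent learners settle at, above, or below the competitive price depends on the demand model and the exploration schedule; probing exactly that sensitivity is what makes bandit models useful for the antitrust analysis the report develops.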
2

Johansen, Richard A., Christina L. Saltus, Molly K. Reif, and Kaytee L. Pokrzywinski. A Review of Empirical Algorithms for the Detection and Quantification of Harmful Algal Blooms Using Satellite-Borne Remote Sensing. U.S. Army Engineer Research and Development Center, June 2022. http://dx.doi.org/10.21079/11681/44523.

Abstract:
Harmful Algal Blooms (HABs) continue to be a global concern, especially since predicting bloom events, including their intensity, extent, and geographic location, remains difficult. However, remote sensing platforms are useful tools for monitoring HABs across space and time. The main objective of this review was to explore the scientific literature to develop a near-comprehensive list of spectrally derived empirical algorithms for satellite imagers commonly utilized for the detection and quantification of HABs and water quality indicators. This review identified 29 WorldView-2 MSI algorithms, 25 Sentinel-2 MSI algorithms, 32 Landsat-8 OLI algorithms, 9 MODIS algorithms, and 64 MERIS/Sentinel-3 OLCI algorithms. This review also revealed that most empirical-based algorithms fall into one of the following general formulas: two-band difference algorithm (2BDA), three-band difference algorithm (3BDA), normalized-difference chlorophyll index (NDCI), or the cyanobacterial index (CI). New empirical algorithm development appears to be constrained, at least in part, by the limited number of HAB-associated spectral features detectable in currently operational imagers. However, these algorithms provide a foundation for future algorithm development as new sensors, technologies, and platforms emerge.
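
The four families named in the review have simple generic shapes. The following are typical forms from this literature, written for reflectance $R(\lambda)$; individual papers vary the exact bands and scalings:

$$
\mathrm{2BDA} = R(\lambda_1) - R(\lambda_2),
\qquad
\mathrm{NDCI} = \frac{R(\lambda_1) - R(\lambda_2)}{R(\lambda_1) + R(\lambda_2)},
$$

$$
\mathrm{3BDA} = \left[ \frac{1}{R(\lambda_1)} - \frac{1}{R(\lambda_2)} \right] R(\lambda_3),
$$

with $\lambda_1, \lambda_2$ typically red-edge and red bands near chlorophyll-a features (for example, roughly 708 nm and 665 nm on MERIS/OLCI-class sensors). The cyanobacterial index (CI) is a spectral-shape quantity computed from a similar band triplet, essentially a discrete second derivative of $R(\lambda)$ around the central band.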
3

Alwan, Iktimal, Dennis D. Spencer, and Rafeed Alkawadri. Comparison of Machine Learning Algorithms in Sensorimotor Functional Mapping. Progress in Neurobiology, December 2023. http://dx.doi.org/10.60124/j.pneuro.2023.30.03.

Abstract:
Objective: To compare the performance of popular machine learning (ML) algorithms in mapping the sensorimotor (SM) cortex and identifying the anterior lip of the central sulcus (CS). Methods: We evaluated support vector machines (SVM), random forest (RF), decision trees (DT), single-layer perceptron (SLP), and multilayer perceptron (MLP) against standard logistic regression (LR) to identify the SM cortex, employing validated features from six minutes of NREM-sleep icEEG data and applying standard common hyperparameters and 10-fold cross-validation. Each algorithm was tested using vetted features based on the statistical significance of classical univariate analysis (p<0.05) and an extended set of 17 features representing power/coherence of different frequency bands, entropy, and interelectrode-based distance. The analysis was performed before and after weight adjustment for imbalanced data (w). Results: 7 subjects and 376 contacts were included. Before optimization, ML algorithms performed comparably employing conventional features (median CS accuracy: 0.89, IQR [0.88-0.9]). After optimization, neural networks outperformed the others in terms of accuracy (MLP: 0.86), area under the curve (AUC) (SLPw, MLPw, MLP: 0.91), recall (SLPw: 0.82, MLPw: 0.81), precision (SLPw: 0.84), and F1-scores (SLPw: 0.82). SVM achieved the best specificity performance. Extending the number of features and adjusting the weights improved recall, precision, and F1-scores by 48.27%, 27.15%, and 39.15%, respectively, with gains or no significant losses in specificity and AUC across CS and Function (correlation r=0.71 between the two clinical scenarios in all performance metrics, p<0.001). Interpretation: Computational passive sensorimotor mapping is feasible and reliable. Feature extension and weight adjustment improve the performance and counterbalance the accuracy paradox. Optimized neural networks outperform other ML algorithms even in binary classification tasks. The best-performing models and the MATLAB® routine employed in signal processing are available to the public at (Link 1).
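
As a schematic of this evaluation protocol (not the study's code: the icEEG features are stood in for by synthetic data, and hyperparameters are left at defaults), the comparison maps directly onto scikit-learn's 10-fold cross-validation, with class weighting as the imbalance adjustment:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: 376 "contacts" with 17 features and imbalanced classes.
X, y = make_classification(n_samples=376, n_features=17,
                           weights=[0.8, 0.2], random_state=0)

models = {
    "LR":  LogisticRegression(max_iter=1000, class_weight="balanced"),
    "SVM": SVC(class_weight="balanced"),
    "DT":  DecisionTreeClassifier(class_weight="balanced"),
    "RF":  RandomForestClassifier(class_weight="balanced"),
    "SLP": Perceptron(class_weight="balanced"),
    "MLP": MLPClassifier(max_iter=2000),   # no class_weight option in sklearn
}

for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```

The study's SLP/MLP are tuned networks, so these sklearn stand-ins only mirror the protocol, not the reported numbers.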
4

Kwong, Man Kam. Sweeping algorithms for five-point stencils and banded matrices. Office of Scientific and Technical Information (OSTI), June 1992. http://dx.doi.org/10.2172/10160879.

5

Kwong, Man Kam. Sweeping algorithms for five-point stencils and banded matrices. Office of Scientific and Technical Information (OSTI), June 1992. http://dx.doi.org/10.2172/7276272.

6

Lumsdaine, A., J. White, D. Webber, and A. Sangiovanni-Vincentelli. A Band Relaxation Algorithm for Reliable and Parallelizable Circuit Simulation. Fort Belvoir, VA: Defense Technical Information Center, August 1988. http://dx.doi.org/10.21236/ada200783.

7

Anderson, Gerald L., and Kalman Peleg. Precision Cropping by Remotely Sensed Prototype Plots and Calibration in the Complex Domain. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7585193.bard.

Abstract:
This research report describes a methodology whereby multi-spectral and hyperspectral imagery from remote sensing is used for deriving predicted field maps of selected plant growth attributes which are required for precision cropping. A major task in precision cropping is to establish areas of the field that differ from the rest of the field and share a common characteristic. Yield distribution maps can be prepared by yield monitors, which are available for some harvester types. Other field attributes of interest in precision cropping, e.g. soil properties, leaf nitrate, biomass etc., are obtained by manual sampling of the field in a grid pattern. Maps of various field attributes are then prepared from these samples by the "Inverse Distance" interpolation method or by Kriging. An improved interpolation method was developed which is based on minimizing the overall curvature of the resulting map. Such maps are the ground truth reference, used for training the algorithm that generates the predicted field maps from remote sensing imagery. Both the reference and the predicted maps are stratified into "Prototype Plots", e.g. 15x15 blocks of 2m pixels whereby the block size is 30x30m. This averaging reduces the datasets to manageable size and significantly improves the typically poor repeatability of remote sensing imaging systems. In the first two years of the project we used the Normalized Difference Vegetation Index (NDVI) for generating predicted yield maps of sugar beets and corn. The NDVI was computed from image cubes of three spectral bands, generated by an optically filtered three-camera video imaging system. A two-dimensional FFT-based regression model Y=f(X) was used, wherein Y was the reference map and X=NDVI was the predictor. The FFT regression method applies the "Wavelet Based", "Pixel Block" and "Image Rotation" transforms to the reference and remote images, prior to the Fast Fourier Transform (FFT) regression method with the "Phase Lock" option. A complex-domain map Yfft is derived by least-squares minimization between the amplitude matrices of X and Y, via the 2D FFT. For one-time predictions, the phase matrix of Y is combined with the amplitude matrix of Yfft, whereby an improved predicted map Yplock is formed. Usually, the residuals of Yplock versus Y are about half of the values of Yfft versus Y. For long-term predictions, the phase matrix of a "field mask" is combined with the amplitude matrices of the reference image Y and the predicted image Yfft. The field mask is a binary image of a pre-selected region of interest in X and Y. The resultant maps Ypref and Ypred are modified versions of Y and Yfft respectively. The residuals of Ypred versus Ypref are even lower than the residuals of Yplock versus Y. The maps Ypref and Ypred represent a close consensus of two independent imaging methods which "view" the same target. In the last two years of the project our remote sensing capability was expanded by the addition of a CASI II airborne hyperspectral imaging system and an ASD hyperspectral radiometer. Unfortunately, the cross-noise and poor repeatability problem we had in multi-spectral imaging was exacerbated in hyperspectral imaging. We were able to overcome this problem by over-flying each field twice in rapid succession and developing the Repeatability Index (RI). The RI quantifies the repeatability of each spectral band in the hyperspectral image cube. Thereby, it is possible to select the bands of higher repeatability for inclusion in the prediction model, while bands of low repeatability are excluded. Further segregation of high and low repeatability bands takes place in the prediction model algorithm, which is based on a combination of a Genetic Algorithm and Partial Least Squares (PLS-GA). In summary, a modus operandi was developed for deriving important plant growth attribute maps (yield, leaf nitrate, biomass and sugar percent in beets) from remote sensing imagery, with sufficient accuracy for precision cropping applications. This achievement is remarkable, given the inherently high cross-noise between the reference and remote imagery as well as the highly non-repeatable nature of remote sensing systems. The above methodologies may be readily adopted by commercial companies which specialize in providing remotely sensed data to farmers.
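
The band-screening step lends itself to a short sketch. The report does not spell out the RI formula here, so the per-band correlation between the two back-to-back overflights below is an assumed stand-in for it:

```python
import numpy as np

def repeatability_index(cube1, cube2):
    """Per-band repeatability of two co-registered hyperspectral cubes
    of shape (rows, cols, bands); here: Pearson correlation per band."""
    bands = cube1.shape[-1]
    ri = np.empty(bands)
    for b in range(bands):
        ri[b] = np.corrcoef(cube1[..., b].ravel(),
                            cube2[..., b].ravel())[0, 1]
    return ri

rng = np.random.default_rng(0)
scene = rng.random((30, 30, 5))                       # synthetic ground signal
flight1 = scene + 0.05 * rng.standard_normal(scene.shape)
flight2 = scene + 0.05 * rng.standard_normal(scene.shape)

ri = repeatability_index(flight1, flight2)
keep = ri > 0.9                    # admit only repeatable bands to the model
print(np.round(ri, 3), keep)
```

Bands passing the screen would then feed the PLS-GA prediction model described in the abstract.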
8

Borges, Carlos F., and Craig S. Peters. An Algorithm for Computing the Stationary Distribution of a Discrete-Time Birth-and-Death Process with Banded Infinitesimal Generator. Fort Belvoir, VA: Defense Technical Information Center, April 1995. http://dx.doi.org/10.21236/ada295810.

9

Terrill, Eric J. X-band Observations of Waves, Algorithm Development, and Validation High Resolution Wave-Air-Sea Interaction DRI. Fort Belvoir, VA: Defense Technical Information Center, September 2012. http://dx.doi.org/10.21236/ada574656.

10

Chen, Z., S. E. Grasby, C. Deblonde, and X. Liu. AI-enabled remote sensing data interpretation for geothermal resource evaluation as applied to the Mount Meager geothermal prospective area. Natural Resources Canada/CMSS/Information Management, 2022. http://dx.doi.org/10.4095/330008.

Abstract:
The objective of this study is to search for features and indicators from the identified geothermal resource sweet spot in the south Mount Meager area that are applicable to other volcanic complexes in the Garibaldi Volcanic Belt. A Landsat 8 multi-spectral band dataset, comprising a total of 57 images ranging from visible through infrared to thermal infrared frequency channels and covering different years and seasons, was selected. Specific features that are indicative of high geothermal heat flux, fractured permeable zones, and groundwater circulation, the three key elements in exploring for geothermal resources, were extracted. The thermal infrared images from different seasons show the occurrence of high-temperature anomalies and their association with volcanic and intrusive bodies, and reveal the variation in location and intensity of the anomalies over four seasons, allowing inference of specific heat transfer mechanisms. Linear features extracted automatically from various frequency bands, using AI/ML algorithms developed for computer vision, show various linear segment groups that are likely surface expressions associated with local volcanic activities, regional deformation and slope failure. In conjunction with regional structural models and field observations, the anomalies and features from remotely sensed images were interpreted to provide new insights for improving our understanding of the Mount Meager geothermal system and its characteristics. After validation, the methods developed and indicators identified in this study can be applied to other volcanic complexes in the Garibaldi, or other volcanic belts, for geothermal resource reconnaissance.
