Ready bibliography on the topic "Algorithme de bandit"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Table of contents
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Algorithme de bandit".
Next to each work in the bibliography you will find an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, when these details are available in the source's metadata.
Journal articles on the topic "Algorithme de bandit"
Ciucanu, Radu, Pascal Lafourcade, Gael Marcadet, and Marta Soare. "SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits". Journal of Artificial Intelligence Research 73 (February 23, 2022): 737–65. http://dx.doi.org/10.1613/jair.1.13163.
Azizi, Javad, Branislav Kveton, Mohammad Ghavamzadeh, and Sumeet Katariya. "Meta-Learning for Simple Regret Minimization". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6709–17. http://dx.doi.org/10.1609/aaai.v37i6.25823.
Zhou, Huozhi, Lingda Wang, Lav Varshney, and Ee-Peng Lim. "A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6933–40. http://dx.doi.org/10.1609/aaai.v34i04.6176.
Li, Youxuan. "Improvement of the recommendation system based on the multi-armed bandit algorithm". Applied and Computational Engineering 36, no. 1 (January 22, 2024): 237–41. http://dx.doi.org/10.54254/2755-2721/36/20230453.
Kuroki, Yuko, Liyuan Xu, Atsushi Miyauchi, Junya Honda, and Masashi Sugiyama. "Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback". Neural Computation 32, no. 9 (September 2020): 1733–73. http://dx.doi.org/10.1162/neco_a_01299.
Niño-Mora, José. "A Fast-Pivoting Algorithm for Whittle's Restless Bandit Index". Mathematics 8, no. 12 (December 15, 2020): 2226. http://dx.doi.org/10.3390/math8122226.
Oswal, Urvashi, Aniruddha Bhargava, and Robert Nowak. "Linear Bandits with Feature Feedback". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5331–38. http://dx.doi.org/10.1609/aaai.v34i04.5980.
Agarwal, Mridul, Vaneet Aggarwal, Abhishek Kumar Umrawal, and Chris Quinn. "DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6557–65. http://dx.doi.org/10.1609/aaai.v35i8.16812.
Qu, Jiaming. "Survey of dynamic pricing based on Multi-Armed Bandit algorithms". Applied and Computational Engineering 37, no. 1 (January 22, 2024): 160–65. http://dx.doi.org/10.54254/2755-2721/37/20230497.
Wan, Zongqi, Zhijie Zhang, Tongyang Li, Jialin Zhang, and Xiaoming Sun. "Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 10087–94. http://dx.doi.org/10.1609/aaai.v37i8.26202.
Doctoral dissertations on the topic "Algorithme de bandit"
Saadane, Sofiane. "Algorithmes stochastiques pour l'apprentissage, l'optimisation et l'approximation du régime stationnaire". Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30203/document.
In this thesis, we study several stochastic algorithms with different purposes, and we therefore begin this manuscript by recalling historical results that frame our work. We then study a bandit algorithm due to Narendra and Shapiro, whose objective was to determine, among several sources, which one is the most profitable without spending too much time on the wrong ones. Our goal is to understand the weaknesses of this algorithm in order to propose a procedure that is optimal with respect to the quantity measuring the performance of a bandit algorithm, the regret. In our results, we propose an algorithm called over-penalized NS which attains a minimax regret bound. A second contribution is to understand the convergence in law of this process. The particularity of the algorithm is that it converges in law toward a non-diffusive process, which makes the study more intricate than the standard case. We use coupling techniques to study this process and derive rates of convergence. The second part of this thesis falls within the scope of optimizing a function using a stochastic algorithm. We study a stochastic version of the so-called heavy-ball method with friction. The particularity of this algorithm is that its dynamics depend on the entire past of the trajectory. The procedure relies on a memory term whose form dictates the behavior of the procedure. In our framework, two types of memory are investigated: polynomial and exponential. We start with general convergence results in the non-convex case. In the case of strongly convex functions, we provide upper bounds on the rate of convergence. Finally, a convergence-in-law result is given in the case of exponential memory. The third part concerns the McKean-Vlasov equations, which were first introduced by Anatoly Vlasov and first studied by Henry McKean in order to model the distribution function of plasma.
Our objective is to propose a stochastic algorithm to approximate the invariant distribution of the McKean-Vlasov equation. Methods are known in the case of diffusion processes (and some more general processes), but the particularity of the McKean-Vlasov process is that it is strongly non-linear, so we have to develop an alternative approach. We introduce the notion of asymptotic pseudotrajectory in order to obtain an efficient procedure.
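The regret that this abstract uses as its performance measure can be made concrete with a small sketch. The following is a generic UCB1 bandit on Bernoulli arms, not the Narendra-Shapiro or over-penalized procedures studied in the thesis; the arm means and horizon are arbitrary illustrative values.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given means; return cumulative pseudo-regret."""
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n          # number of pulls per arm
    sums = [0.0] * n          # total observed reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1       # pull each arm once to initialize estimates
        else:
            # empirical mean plus exploration bonus
            arm = max(range(n), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]   # pseudo-regret: gap of the chosen arm
    return regret

print(ucb1([0.3, 0.5, 0.7], 10_000))
```

Regret grows logarithmically in the horizon here, which is the behavior that minimax and instance-dependent bounds like the ones above quantify.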
Faury, Louis. "Variance-sensitive confidence intervals for parametric and offline bandits". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT046.
In this dissertation we present recent contributions to the problem of optimization under bandit feedback through the design of variance-sensitive confidence intervals. We tackle two distinct topics: (1) the regret minimization task in Generalized Linear Bandits (GLBs), a broad class of non-linear parametric bandits, and (2) the problem of offline policy optimization under bandit feedback. For (1) we study the effects of non-linearity in GLBs and challenge the current understanding that a high level of non-linearity is detrimental to the exploration-exploitation trade-off. We introduce improved algorithms as well as a novel analysis which proves that, if correctly handled, the regret minimization task in GLBs is not necessarily harder than for their linear counterparts. It can even be easier for some important members of the GLB family, such as the Logistic Bandit. Our approach leverages a new confidence set which captures the non-linearity of the reward signal through its variance, along with a local treatment of the non-linearity through a so-called self-concordance analysis. For (2) we leverage results from the distributionally robust optimization framework to construct asymptotic variance-sensitive confidence intervals for the counterfactual evaluation of policies. This allows us to ensure conservatism (sought by risk-averse agents) while searching offline for promising policies. Our confidence intervals lead to new counterfactual objectives which, contrary to their predecessors, are better suited for practical deployment thanks to their convex and composite nature.
Sani, Amir. "Apprentissage automatique pour la prise de décisions". Thesis, Lille 1, 2015. http://www.theses.fr/2015LIL10038/document.
Strategic decision-making over valuable resources should consider risk-averse objectives. Many practical areas of application treat risk as central to decision-making; machine learning, however, generally does not. Research should therefore provide insights and algorithms that endow machine learning with the ability to consider decision-theoretic risk: in particular, by estimating decision-theoretic risk on short dependent sequences generated from the most general possible class of processes for statistical inference, and through decision-theoretic risk objectives in sequential decision-making. This thesis studies these two problems to provide principled algorithmic methods for considering decision-theoretic risk in machine learning. An algorithm with state-of-the-art performance is introduced for accurate estimation of risk statistics on the most general class of stationary-ergodic processes, and risk-averse objectives are introduced for sequential decision-making (online learning) in both the stochastic multi-armed bandit setting and the adversarial full-information setting.
Clement, Benjamin. "Adaptive Personalization of Pedagogical Sequences using Machine Learning". Thesis, Bordeaux, 2018. http://www.theses.fr/2018BORD0373/document.
Can computers teach people? To answer this question, Intelligent Tutoring Systems are a rapidly expanding field of research within the Information and Communication Technologies for Education community. This subject brings together different issues and researchers from various fields, such as psychology, didactics, neuroscience and, particularly, machine learning. Digital technologies are becoming more and more a part of everyday life with the development of tablets and smartphones, and it seems natural to consider using them for educational purposes. This raises several questions, such as how to make user interfaces accessible to everyone, how to make educational content motivating, and how to customize it to individual learners. In this PhD, we developed methods, grouped in the aptly named HMABITS framework, to adapt pedagogical activity sequences based on learners' performances and preferences, so as to maximize their learning speed and motivation. These methods use computational models of intrinsic motivation and curiosity-driven learning to identify the activities providing the highest learning progress, and use Multi-Armed Bandit algorithms to manage the exploration/exploitation trade-off inside the activity space. Activities of optimal interest are thus favored, with the aim of keeping the learner in a state of Flow or in his or her Zone of Proximal Development. Moreover, some of our methods allow the student to make choices about contextual features or pedagogical content, which is a vector of self-determination and motivation. To evaluate the effectiveness and relevance of our algorithms, we carried out several types of experiments. We first evaluated these methods with numerical simulations before applying them to real teaching conditions. To do this, we developed multiple models of learners, since a single model never exactly replicates the behavior of a real learner.
The simulation results show that the HMABITS framework achieves comparable, and in some cases better, learning results than an optimal solution or an expert sequence. We then developed our own pedagogical scenario and serious game to test our algorithms in classrooms with real students. We developed a game on the theme of number decomposition, through the manipulation of money, for children aged 6 to 8, and then worked with educational institutions and several schools in the Bordeaux school district. Overall, about 1000 students participated in trial lessons using the tablet application. The results of the real-world studies show that the HMABITS framework allows students to attempt more diverse and difficult activities, to achieve better learning, and to be more motivated than with an expert sequence. The results show that this effect is even greater when the students have the possibility to make choices.
Maillard, Odalric-Ambrym. "APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement". PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00845410.
Dorard, L. R. M. "Bandit algorithms for searching large spaces". Thesis, University College London (University of London), 2012. http://discovery.ucl.ac.uk/1348319/.
Jedor, Matthieu. "Bandit algorithms for recommender system optimization". Thesis, Université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM027.
In this PhD thesis, we study the optimization of recommender systems with the objective of providing more refined suggestions of items for the user's benefit. The task is modeled using the multi-armed bandit framework. In a first part, we address two problems that commonly occur in recommendation systems: the large number of items to handle and the management of sponsored content. In a second part, we investigate the empirical performance of bandit algorithms, and especially how to tune conventional algorithms to improve results in the stationary and non-stationary environments that arise in practice. This leads us to analyze, both theoretically and empirically, the greedy algorithm, which in some cases outperforms the state of the art.
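The greedy-versus-exploration comparison this abstract refers to can be illustrated with a toy experiment. This is a generic sketch, not the tuned variants analyzed in the thesis; the arm means, horizon, and epsilon are made-up values.

```python
import random

def epsilon_greedy(means, horizon, eps, seed=0):
    """Epsilon-greedy on Bernoulli arms; eps=0.0 is the pure greedy policy.
    Returns the average reward obtained over the horizon."""
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n
    sums = [0.0] * n
    total = 0.0
    for t in range(horizon):
        if t < n:
            arm = t                                # try each arm once first
        elif rng.random() < eps:
            arm = rng.randrange(n)                 # explore uniformly at random
        else:
            arm = max(range(n), key=lambda a: sums[a] / counts[a])  # exploit
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total / horizon

print(epsilon_greedy([0.2, 0.8], 5000, eps=0.0))   # greedy baseline
print(epsilon_greedy([0.2, 0.8], 5000, eps=0.1))   # with exploration
```

Greedy can lock onto a suboptimal arm after an unlucky start, which is exactly why its strong empirical performance in some recommendation settings, as noted above, is an interesting finding.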
Besson, Lilian. "Multi-Players Bandit Algorithms for Internet of Things Networks". Thesis, CentraleSupélec, 2019. http://www.theses.fr/2019CSUP0005.
In this PhD thesis, we study wireless networks and reconfigurable end-devices that can access Cognitive Radio networks, in unlicensed bands and without central control. We focus on Internet of Things (IoT) networks, with the objective of extending the devices' battery life by equipping them with low-cost but efficient machine learning algorithms, in order to let them automatically improve the efficiency of their wireless communications. We propose different models of IoT networks, and we show empirically, on both numerical simulations and real-world validation, the possible gain of our methods, which use Reinforcement Learning. The different network access problems are modeled as Multi-Armed Bandits (MAB), but we found the realistic models intractable to analyze: proving the convergence of many IoT devices playing a collaborative game, without communication or coordination, is hard when they all follow random activation patterns. The rest of this manuscript thus studies two restricted models: first multi-player bandits in stationary problems, then non-stationary single-player bandits. We also detail another contribution, SMPyBandits, our open-source Python library for numerical MAB simulations, which covers all the studied models and more.
Deffayet, Romain. "Bandit Algorithms for Adaptive Modulation and Coding in Wireless Networks". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281884.
The demand for high-quality mobile networks has grown considerably in recent years and will continue to grow in the near future. This is the result of increased traffic due to new use cases (HD video, live streaming, online gaming, ...) but also comes from a diversification of traffic, in particular due to shorter and more frequent transmissions, e.g. from IoT devices or other telemetry applications. Mobile networks are becoming ever more complex, and the need for better management of network characteristics is higher than ever. The combined effect of these two trends creates a trade-off: while one wants to design decision-making algorithms that achieve very high performance, one would also like those algorithms to do so in every configuration that may arise in such a complex network. Instead, this thesis proposes to limit the scope of the decision algorithms by introducing online learning. The thesis focuses on the initial MCS selection in Adaptive Modulation and Coding, where one must choose an initial transmission rate that guarantees fast communication and the fewest possible transmission errors. We formulate the problem as a Reinforcement Learning problem and propose relevant restrictions to mathematically simpler frameworks such as Multi-Armed Bandits and Contextual Bandits. Eight bandit algorithms are tested and reviewed with practical deployment in mind. The thesis shows that a Reinforcement Learning agent can improve the utilization of the link capacity between transmitter and receiver. First we present a cell-level Multi-Armed Bandit agent, which learns the optimal initial MCS in a given cell, and then a contextual extension of this agent with user-specific features. The proposed method achieves an 8% increase in median throughput and a 65% reduction in median regret for bursty traffic during the first 0.5 s of transmission, compared to a fixed baseline.
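A cell-level bandit agent of the kind described above can be sketched as follows. This uses Thompson sampling with Beta posteriors, only one plausible choice among the eight algorithms the thesis evaluates, and the per-MCS success probabilities and spectral-efficiency weights are hypothetical values, not results from the thesis.

```python
import random

def thompson_mcs(success_prob, rates, horizon, seed=0):
    """Thompson sampling over candidate initial MCS indices.
    success_prob[i]: hypothetical first-transmission success rate of MCS i;
    rates[i]: hypothetical spectral efficiency (reward earned on success).
    Returns how often each MCS was selected."""
    rng = random.Random(seed)
    n = len(success_prob)
    alpha, beta = [1] * n, [1] * n   # Beta(1, 1) prior per MCS
    picks = [0] * n
    for _ in range(horizon):
        # sample a plausible success rate per MCS, pick best expected throughput
        sample = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        arm = max(range(n), key=lambda i: rates[i] * sample[i])
        ok = rng.random() < success_prob[arm]
        alpha[arm] += ok                 # bool counts as 1/0
        beta[arm] += not ok
        picks[arm] += 1
    return picks

# three candidate MCS levels: aggressive, medium, conservative
print(thompson_mcs([0.6, 0.5, 0.9], [7.0, 4.0, 1.5], 2000))
```

The agent quickly concentrates its picks on the MCS with the best success-rate/throughput trade-off, which is the behavior behind the throughput and regret gains reported above.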
Degenne, Rémy. "Impact of structure on the design and analysis of bandit algorithms". Thesis, Université de Paris (2019-....), 2019. http://www.theses.fr/2019UNIP7179.
In this thesis, we study sequential learning problems called stochastic multi-armed bandits. First, a new bandit algorithm is presented. The analysis of that algorithm uses confidence intervals on the means of the arms' reward distributions, as most bandit proofs do. In a parametric setting, we derive concentration inequalities which quantify the deviation between the mean parameter of a distribution and its empirical estimate in order to obtain confidence intervals. These inequalities are presented as bounds on the Kullback-Leibler divergence. Three extensions of the stochastic multi-armed bandit problem are then studied. First, we study the so-called combinatorial semi-bandit problem, in which an algorithm chooses a set of arms and the reward of each of these arms is observed. The minimal attainable regret then depends on the correlation between the arm distributions. We then consider a setting in which the observation mechanism changes. One source of difficulty of the bandit problem is the scarcity of information: only the arm pulled is observed. We show how to make efficient use of any supplementary free information (which does not influence the regret). Finally, a new family of algorithms is introduced to obtain both regret minimization and best-arm identification guarantees; each algorithm of the family realizes a trade-off between regret and the time needed to identify the best arm. In a second part, we study the so-called pure exploration problem, in which an algorithm is evaluated not on its regret but on the probability that it returns a wrong answer to a question about the arm distributions. We determine the complexity of such problems and design algorithms with performance close to that complexity.
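Confidence intervals expressed as bounds on the Kullback-Leibler divergence, as mentioned in this abstract, can be computed numerically in the Bernoulli case: the upper confidence bound is the largest q whose KL divergence from the empirical mean, scaled by the pull count, stays below an exploration level. This is a standard kl-UCB-style computation sketched under illustrative values, not the thesis's exact inequalities.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_upper_confidence(p_hat, pulls, level):
    """Largest q with pulls * KL(p_hat, q) <= level, found by bisection.
    p_hat: empirical mean; pulls: number of observations; level: exploration term."""
    lo, hi = p_hat, 1.0
    for _ in range(50):                 # 50 halvings give ample precision
        mid = (lo + hi) / 2
        if pulls * bernoulli_kl(p_hat, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo

print(kl_upper_confidence(0.4, 100, math.log(1000)))
```

More pulls shrink the interval toward the empirical mean, which is what the concentration inequalities above quantify.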
Books on the topic "Algorithme de bandit"
Braun, Kathrin, and Cordula Kropp, eds. In digitaler Gesellschaft. Bielefeld, Germany: transcript Verlag, 2021. http://dx.doi.org/10.14361/9783839454534.
Block, Katharina, Anne Deremetz, Anna Henkel, and Malte Rehbein, eds. 10 Minuten Soziologie: Digitalisierung. Bielefeld, Germany: transcript Verlag, 2022. http://dx.doi.org/10.14361/9783839457108.
Bandit Algorithms. Cambridge University Press, 2020.
Lattimore, Tor, and Csaba Szepesvári. Bandit Algorithms. Cambridge University Press, 2020.
White, John Myles. Bandit Algorithms for Website Optimization. O'Reilly Media, Incorporated, 2012.
Głowacka, Dorota. Bandit Algorithms in Information Retrieval. Now Publishers, 2019.
Bandit Algorithms for Website Optimization: Developing, Deploying, and Debugging. O'Reilly Media, 2012.
Verständig, Dan, Christina Kast, Janne Stricker, and Andreas Nürnberger, eds. Algorithmen und Autonomie. Verlag Barbara Budrich, 2022. http://dx.doi.org/10.3224/84742520.
Beyer, Elena, Katharina Erler, Christoph Hartmann, Malte Kramme, Michael F. Müller, Tereza Pertot, Elif Tuna, and Felix M. Wilke, eds. Privatrecht 2050 - Blick in die digitale Zukunft. Nomos Verlagsgesellschaft mbH & Co. KG, 2020. http://dx.doi.org/10.5771/9783748901723.
Book chapters on the topic "Algorithme de bandit"
Cesa-Bianchi, Nicolò. "Multi-armed Bandit Problem". In Encyclopedia of Algorithms, 1356–59. New York, NY: Springer New York, 2016. http://dx.doi.org/10.1007/978-1-4939-2864-4_768.
Cesa-Bianchi, Nicolò. "Multi-armed Bandit Problem". In Encyclopedia of Algorithms, 1–5. Boston, MA: Springer US, 2014. http://dx.doi.org/10.1007/978-3-642-27848-8_768-1.
Audibert, Jean-Yves, Rémi Munos, and Csaba Szepesvári. "Tuning Bandit Algorithms in Stochastic Environments". In Lecture Notes in Computer Science, 150–65. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-75225-7_15.
Hendel, Gregor, Matthias Miltenberger, and Jakob Witzig. "Adaptive Algorithmic Behavior for Solving Mixed Integer Programs Using Bandit Algorithms". In Operations Research Proceedings, 513–19. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-18500-8_64.
Poland, Jan. "FPL Analysis for Adaptive Bandits". In Stochastic Algorithms: Foundations and Applications, 58–69. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11571155_7.
Vermorel, Joannès, and Mehryar Mohri. "Multi-armed Bandit Algorithms and Empirical Evaluation". In Machine Learning: ECML 2005, 437–48. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11564096_42.
Caelen, Olivier, and Gianluca Bontempi. "Improving the Exploration Strategy in Bandit Algorithms". In Lecture Notes in Computer Science, 56–68. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-92695-5_5.
Tyagi, Hemant, and Bernd Gärtner. "Continuum Armed Bandit Problem of Few Variables in High Dimensions". In Approximation and Online Algorithms, 108–19. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-08001-7_10.
Shminke, Boris. "gym-saturation: Gymnasium Environments for Saturation Provers (System description)". In Lecture Notes in Computer Science, 187–99. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-43513-3_11.
Viappiani, Paolo. "Thompson Sampling for Bayesian Bandits with Resets". In Algorithmic Decision Theory, 399–410. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-41575-3_31.
Conference papers on the topic "Algorithme de bandit"
Bouneffouf, Djallel, Irina Rish, Guillermo Cecchi, and Raphaël Féraud. "Context Attentive Bandits: Contextual Bandit with Restricted Context". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/203.
Gao, Ruijiang, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Min Kyung Lee, and Matthew Lease. "Human-AI Collaboration with Bandit Feedback". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/237.
Liu, Fang, Sinong Wang, Swapna Buccapatnam, and Ness Shroff. "UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits". In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/338.
Gupta, Samarth, Shreyas Chaudhari, Subhojyoti Mukherjee, Gauri Joshi, and Osman Yagan. "A Unified Approach to Translate Classical Bandit Algorithms to Structured Bandits". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413628.
Xie, Miao, Wotao Yin, and Huan Xu. "AutoBandit: A Meta Bandit Online Learning System". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/719.
Guo, Xueying, Xiaoxiao Wang, and Xin Liu. "AdaLinUCB: Opportunistic Learning for Contextual Bandits". In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/336.
Yang, Peng, Peilin Zhao, and Xin Gao. "Bandit Online Learning on Graphs via Adaptive Optimization". In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/415.
Peng, Yi, Miao Xie, Jiahao Liu, Xuying Meng, Nan Li, Cheng Yang, Tao Yao, and Rong Jin. "A Practical Semi-Parametric Contextual Bandit". In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/450.
Ou, Mingdong, Nan Li, Shenghuo Zhu, and Rong Jin. "Multinomial Logit Bandit with Linear Utility Functions". In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/361.
Carlsson, Emil, Devdatt Dubhashi, and Fredrik D. Johansson. "Thompson Sampling for Bandits with Clustered Arms". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/305.
Reports on the topic "Algorithme de bandit"
Marty, Frédéric, and Thierry Warin. Deciphering Algorithmic Collusion: Insights from Bandit Algorithms and Implications for Antitrust Enforcement. CIRANO, December 2023. http://dx.doi.org/10.54932/iwpg7510.
Johansen, Richard A., Christina L. Saltus, Molly K. Reif, and Kaytee L. Pokrzywinski. A Review of Empirical Algorithms for the Detection and Quantification of Harmful Algal Blooms Using Satellite-Borne Remote Sensing. U.S. Army Engineer Research and Development Center, June 2022. http://dx.doi.org/10.21079/11681/44523.
Kwong, Man Kam. Sweeping algorithms for five-point stencils and banded matrices. Office of Scientific and Technical Information (OSTI), June 1992. http://dx.doi.org/10.2172/10160879.
Kwong, Man Kam. Sweeping algorithms for five-point stencils and banded matrices. Office of Scientific and Technical Information (OSTI), June 1992. http://dx.doi.org/10.2172/7276272.
Alwan, Iktimal, Dennis D. Spencer, and Rafeed Alkawadri. Comparison of Machine Learning Algorithms in Sensorimotor Functional Mapping. Progress in Neurobiology, December 2023. http://dx.doi.org/10.60124/j.pneuro.2023.30.03.
Lumsdaine, A., J. White, D. Webber, and A. Sangiovanni-Vincentelli. A Band Relaxation Algorithm for Reliable and Parallelizable Circuit Simulation. Fort Belvoir, VA: Defense Technical Information Center, August 1988. http://dx.doi.org/10.21236/ada200783.
Chen, Z., S. E. Grasby, C. Deblonde, and X. Liu. AI-enabled remote sensing data interpretation for geothermal resource evaluation as applied to the Mount Meager geothermal prospective area. Natural Resources Canada/CMSS/Information Management, 2022. http://dx.doi.org/10.4095/330008.
Pełny tekst źródłaAnderson, Gerald L., i Kalman Peleg. Precision Cropping by Remotely Sensed Prorotype Plots and Calibration in the Complex Domain. United States Department of Agriculture, grudzień 2002. http://dx.doi.org/10.32747/2002.7585193.bard.
Pełny tekst źródłaBorges, Carlos F., i Craig S. Peters. An Algorithm for Computing the Stationary Distribution of a Discrete-Time Birth-and-Death Process with Banded Infinitesimal Generator. Fort Belvoir, VA: Defense Technical Information Center, kwiecień 1995. http://dx.doi.org/10.21236/ada295810.
Pełny tekst źródłaTerrill, Eric J. X-band Observations of Waves, Algorithm Development, and Validation High Resolution Wave-Air-Sea Interaction DRI. Fort Belvoir, VA: Defense Technical Information Center, wrzesień 2012. http://dx.doi.org/10.21236/ada574656.