A ready-made bibliography on the topic "Apprentissage par renforcement distributionnel"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Apprentissage par renforcement distributionnel".
Journal articles on the topic "Apprentissage par renforcement distributionnel"
Griffon, L., M. Chennaoui, D. Leger, and M. Strauss. "Apprentissage par renforcement dans la narcolepsie de type 1". Médecine du Sommeil 15, no. 1 (March 2018): 60. http://dx.doi.org/10.1016/j.msom.2018.01.164.
Garcia, Pascal. "Exploration guidée en apprentissage par renforcement. Connaissances a priori et relaxation de contraintes". Revue d'intelligence artificielle 20, no. 2-3 (June 1, 2006): 235–75. http://dx.doi.org/10.3166/ria.20.235-275.
Degris, Thomas, Olivier Sigaud, and Pierre-Henri Wuillemin. "Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs". Revue d'intelligence artificielle 23, no. 2-3 (May 13, 2009): 221–51. http://dx.doi.org/10.3166/ria.23.221-251.
Host, Shirley, and Nicolas Sabouret. "Apprentissage par renforcement d'actes de communication dans un système multi-agent". Revue d'intelligence artificielle 24, no. 2 (April 17, 2010): 159–88. http://dx.doi.org/10.3166/ria.24.159-188.
Villatte, Matthieu, David Scholiers, and Esteve Freixa i Baqué. "Apprentissage du comportement optimal par exposition aux contingences dans le dilemme de Monty Hall". ACTA COMPORTAMENTALIA 12, no. 1 (June 1, 2004): 5–24. http://dx.doi.org/10.32870/ac.v12i1.14548.
CHIALI, Ramzi. "Le texte littéraire comme référentiel préférentiel dans le renforcement de la compétence interculturelle en contexte institutionnel. Réflexion et dynamique didactique". Revue plurilingue : Études des Langues, Littératures et Cultures 7, no. 1 (July 14, 2023): 70–78. http://dx.doi.org/10.46325/ellic.v7i1.99.
Altintas, Gulsun, and Isabelle Royer. "Renforcement de la résilience par un apprentissage post-crise : une étude longitudinale sur deux périodes de turbulence". M@n@gement 12, no. 4 (2009): 266. http://dx.doi.org/10.3917/mana.124.0266.
Dutech, Alain, and Manuel Samuelides. "Apprentissage par renforcement pour les processus décisionnels de Markov partiellement observés. Apprendre une extension sélective du passé". Revue d'intelligence artificielle 17, no. 4 (August 1, 2003): 559–89. http://dx.doi.org/10.3166/ria.17.559-589.
Scholiers, David, and Matthieu Villatte. "Comportement Non-optimal versus Illusion Cognitive". ACTA COMPORTAMENTALIA 11, no. 1 (June 1, 2003): 5–17. http://dx.doi.org/10.32870/ac.v11i1.14611.
BOUCHET, N., L. FRENILLOT, M. DELAHAYE, M. GAILLARD, P. MESTHE, E. ESCOURROU, and L. GIMENEZ. "GESTION DES EMOTIONS VECUES PAR LES ETUDIANTS EN 3E CYCLE DE MEDECINE GENERALE DE TOULOUSE AU COURS DE LA PRISE EN CHARGE DES PATIENTS : ETUDE QUALITATIVE". EXERCER 34, no. 192 (April 1, 2023): 184–90. http://dx.doi.org/10.56746/exercer.2023.192.184.
Pełny tekst źródłaRozprawy doktorskie na temat "Apprentissage par renforcement distributionnel"
Hêche, Félicien. "Risk-sensitive machine learning for emergency medical resource optimization and other applications". Electronic Thesis or Diss., Bourgogne Franche-Comté, 2024. http://www.theses.fr/2024UBFCD048.
The significant increase in demand for emergency medical care over the last decades places considerable strain on Emergency Medical Services (EMS), leading to several undesirable effects. Motivated by the remarkable results obtained by modern Machine Learning (ML) algorithms, this thesis primarily explores the use of ML for optimizing EMS resources, aiming to address some of the challenges faced by this healthcare system. The first contribution of the thesis introduces a new Reinforcement Learning (RL) algorithm, called Latent Offline Distributional Actor-Critic (LODAC), specifically designed to satisfy key criteria essential for ensuring safe and effective behavior in the management of EMS resources. Following that, several experiments are conducted to identify the most important features that need to be incorporated into our state representation. Findings suggest that only time significantly affects the occurrence of emergencies. These results argue for the use of stochastic methods rather than ML to optimize pre-hospital resources. Following these conclusions, new methods for EMS resource allocation and relocation based on inhomogeneous Poisson processes are developed. Finally, results obtained with LODAC suggest the potential of distributional RL in stochastic environments. To further investigate this avenue, we isolate the central distributional component of LODAC and conduct a series of experiments with this algorithm in another challenging stochastic context: natural gas futures trading. The outcomes of these experiments underscore the effectiveness of distributional RL in such environments.
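The entry above mentions allocation and relocation methods built on inhomogeneous Poisson processes but does not reproduce their construction. As a minimal illustrative sketch of that building block only, emergency arrival times with a time-varying rate can be sampled by Lewis-Shedler thinning; the intensity function `daily_rate`, its bound `rate_max`, and the 24-hour horizon below are invented assumptions, not taken from the thesis.

```python
import math
import random

def sample_inhomogeneous_poisson(rate, rate_max, horizon, rng):
    """Draw event times on [0, horizon) from an inhomogeneous Poisson
    process with intensity rate(t) <= rate_max, via Lewis-Shedler thinning."""
    t, events = 0.0, []
    while True:
        # Candidate arrival from the dominating homogeneous process.
        t += rng.expovariate(rate_max)
        if t >= horizon:
            return events
        # Keep the candidate with probability rate(t) / rate_max.
        if rng.random() * rate_max <= rate(t):
            events.append(t)

# Hypothetical daily intensity (calls per hour), peaking in the evening.
daily_rate = lambda t: 2.0 + 1.5 * math.sin(2 * math.pi * (t - 12.0) / 24.0)
calls = sample_inhomogeneous_poisson(daily_rate, rate_max=3.5, horizon=24.0,
                                     rng=random.Random(0))
print(len(calls), "simulated emergency calls over 24 h")
```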
Achab, Mastane. "Ranking and risk-aware reinforcement learning". Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT020.
This thesis is divided into two parts: the first part is on ranking and the second on risk-aware reinforcement learning. While binary classification is the flagship application of empirical risk minimization (ERM), the main paradigm of machine learning, more challenging problems such as bipartite ranking can also be expressed through that setup. In bipartite ranking, the goal is to order, by means of scoring methods, all the elements of some feature space based on a training dataset composed of feature vectors with their binary labels. This thesis extends this setting to the continuous ranking problem, a variant where the labels take continuous values instead of being simply binary. The analysis of ranking data, initiated in the 18th century in the context of elections, has led to another ranking problem using ERM, namely ranking aggregation, and more precisely the Kemeny consensus approach. From a training dataset made of ranking data, such as permutations or pairwise comparisons, the goal is to find the single "median permutation" that best corresponds to a consensus order. We present a less drastic dimensionality-reduction approach where a distribution on rankings is approximated by a simpler distribution, which is not necessarily reduced to a Dirac mass as in ranking aggregation. For that purpose, we rely on mathematical tools from the theory of optimal transport, such as Wasserstein metrics. The second part of this thesis focuses on risk-aware versions of the stochastic multi-armed bandit problem and of reinforcement learning (RL), where an agent interacts with a dynamic environment by taking actions and receiving rewards, the objective being to maximize the total payoff. In particular, a novel atomic distributional RL approach is provided: the distribution of the total payoff is approximated by particles that correspond to trimmed means.
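The closing sentence above compresses the atomic distributional RL idea: summarize the payoff distribution by atoms built from trimmed means. A minimal sketch of the trimmed-mean building block only, with invented sample returns and trimming levels (this is not the thesis's full algorithm):

```python
import numpy as np

def trimmed_mean(samples, alpha):
    """Mean after discarding the alpha lower and upper tails: alpha = 0
    gives the plain mean; alpha close to 0.5 approaches the median."""
    xs = np.sort(np.asarray(samples, dtype=float))
    k = int(alpha * len(xs))
    return xs[k:len(xs) - k].mean()

# Toy particle approximation of a return distribution: each sample is one
# observed discounted payoff of a fixed policy (values are made up).
rng = np.random.default_rng(0)
returns = rng.normal(loc=1.0, scale=2.0, size=500)
atoms = [trimmed_mean(returns, a) for a in (0.0, 0.1, 0.25, 0.4)]
print(atoms)  # increasingly robust, risk-aware summaries of the payoff
```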
Zimmer, Matthieu. "Apprentissage par renforcement développemental". Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0008/document.
Reinforcement learning allows an agent to learn a behavior that has never been previously defined by humans. The agent discovers the environment and the different consequences of its actions through its interaction: it learns from its own experience, without having pre-established knowledge of the goals or effects of its actions. This thesis tackles how deep learning can help reinforcement learning handle continuous spaces and environments with many degrees of freedom, in order to solve problems closer to reality. Indeed, neural networks have good scalability and representativeness. They make it possible to approximate functions on continuous spaces and allow a developmental approach, because they require little a priori knowledge of the domain. We seek to reduce the amount of interaction the agent needs to achieve acceptable behavior. To do so, we propose the Neural Fitted Actor-Critic framework, which defines several data-efficient actor-critic algorithms. We examine how the agent can fully exploit the transitions generated by previous behaviors by integrating off-policy data into the proposed framework. Finally, we study how the agent can learn faster by taking advantage of the development of its body, in particular by proceeding with a gradual increase in the dimensionality of its sensorimotor space.
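The Neural Fitted Actor-Critic updates themselves are not given in the abstract above. As a reminder of the plain actor-critic principle the framework extends, here is a minimal tabular sketch on a toy two-armed bandit; the problem and all constants are illustrative assumptions, not the thesis's algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.0, 1.0])  # toy problem: action 1 pays more
prefs = np.zeros(2)                # actor: preferences -> softmax policy
baseline = 0.0                     # critic: value estimate used as baseline
alpha_actor, alpha_critic = 0.1, 0.1

for _ in range(2000):
    policy = np.exp(prefs - prefs.max())
    policy /= policy.sum()
    action = rng.choice(2, p=policy)
    reward = true_means[action] + rng.normal()
    td_error = reward - baseline            # critic's one-step error
    baseline += alpha_critic * td_error     # critic update
    grad_log_pi = -policy.copy()
    grad_log_pi[action] += 1.0              # d log pi(action) / d prefs
    prefs += alpha_actor * td_error * grad_log_pi  # actor update

print(policy)  # should strongly favour the better action
```

The point of the sketch is the coupling: a single TD error drives both the critic's value estimate and the actor's policy parameters.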
Kozlova, Olga. "Apprentissage par renforcement hiérarchique et factorisé". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2010. http://tel.archives-ouvertes.fr/tel-00632968.
Filippi, Sarah. "Stratégies optimistes en apprentissage par renforcement". PhD thesis, Ecole nationale supérieure des telecommunications - ENST, 2010. http://tel.archives-ouvertes.fr/tel-00551401.
Théro, Héloïse. "Contrôle, agentivité et apprentissage par renforcement". Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE028/document.
Sense of agency, or subjective control, can be defined as the feeling that we control our actions and, through them, effects in the outside world. This cluster of experiences depends on the ability to learn action-outcome contingencies, and a classical algorithm to model this originates in the field of human reinforcement learning. In this PhD thesis, we used the cognitive-modeling approach to further investigate the interaction between perceived control and reinforcement learning. First, we saw that participants undergoing a reinforcement-learning task experienced higher agency; this influence of reinforcement learning on agency comes as no surprise, because reinforcement learning relies on linking a voluntary action and its outcome. But our results also suggest that agency influences reinforcement learning in two ways. We found that people learn action-outcome contingencies based on a default assumption: their actions make a difference to the world. Finally, we also found that the mere fact of choosing freely shapes the learning processes following that decision. Our general conclusion is that agency and reinforcement learning, two fundamental fields of human psychology, are deeply intertwined. Contrary to machines, humans do care about being in control, or about making the right choice, and this results in integrating information in a one-sided way.
Munos, Rémi. "Apprentissage par renforcement, étude du cas continu". Paris, EHESS, 1997. http://www.theses.fr/1997EHESA021.
Lesner, Boris. "Planification et apprentissage par renforcement avec modèles d'actions compacts". Caen, 2011. http://www.theses.fr/2011CAEN2074.
We study Markov Decision Processes represented with Probabilistic STRIPS action models. The first part of our work is about solving those processes in a compact way. To that end we propose two algorithms. The first one, based on propositional formula manipulation, obtains approximate solutions in tractable propositional fragments such as Horn and 2-CNF. The second algorithm solves problems represented in PPDDL exactly and efficiently, using a new notion of extended value functions. The second part is about learning such action models. We propose different approaches to the problem of ambiguous observations occurring while learning. First, a heuristic method based on Linear Programming gives good results in practice, yet without theoretical guarantees. We then describe a learning algorithm in the "Know What It Knows" framework. This approach gives strong theoretical guarantees on the quality of the learned models as well as on the sample complexity. These two approaches are then put into a Reinforcement Learning setting to allow an empirical evaluation of their respective performances.
Maillard, Odalric-Ambrym. "APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement". Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00845410.
Pełny tekst źródłaKsiążki na temat "Apprentissage par renforcement distributionnel"
Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An introduction. Cambridge, Mass.: MIT Press, 1998.
Ontario. Esquisse de cours 12e année: Sciences de l'activité physique pse4u cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Technologie de l'information en affaires btx4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Études informatiques ics4m cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Mathématiques de la technologie au collège mct4c cours précollégial. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Sciences snc4m cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: English eae4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Le Canada et le monde: une analyse géographique cgw4u cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Environnement et gestion des ressources cgr4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Histoire de l'Occident et du monde chy4c cours précollégial. Vanier, Ont: CFORP, 2002.
Znajdź pełny tekst źródłaCzęści książek na temat "Apprentissage par renforcement distributionnel"
Tazdaït, Tarik, and Rabia Nessah. "5. Vote et apprentissage par renforcement". In Le paradoxe du vote, 157–77. Éditions de l'École des hautes études en sciences sociales, 2013. http://dx.doi.org/10.4000/books.editionsehess.1931.
Pełny tekst źródłaBENDELLA, Mohammed Salih, i Badr BENMAMMAR. "Impact de la radio cognitive sur le green networking : approche par apprentissage par renforcement". W Gestion du niveau de service dans les environnements émergents. ISTE Group, 2020. http://dx.doi.org/10.51926/iste.9002.ch8.
Organizational reports on the topic "Apprentissage par renforcement distributionnel"
Melloni, Gian. Le leadership des autorités locales en matière d'assainissement et d'hygiène : expériences et apprentissage de l'Afrique de l'Ouest. Institute of Development Studies (IDS), January 2022. http://dx.doi.org/10.19088/slh.2022.002.