Academic literature on the topic "Apprentissage par renforcement multiagent"
Consult the thematic lists of articles, books, theses, conference proceedings, and other academic sources on the topic "Apprentissage par renforcement multiagent".
Journal articles on the topic "Apprentissage par renforcement multiagent"
Griffon, L., M. Chennaoui, D. Leger and M. Strauss. "Apprentissage par renforcement dans la narcolepsie de type 1". Médecine du Sommeil 15, no. 1 (March 2018): 60. http://dx.doi.org/10.1016/j.msom.2018.01.164.
Garcia, Pascal. "Exploration guidée en apprentissage par renforcement. Connaissances a priori et relaxation de contraintes". Revue d'intelligence artificielle 20, no. 2-3 (June 1, 2006): 235–75. http://dx.doi.org/10.3166/ria.20.235-275.
Degris, Thomas, Olivier Sigaud and Pierre-Henri Wuillemin. "Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs". Revue d'intelligence artificielle 23, no. 2-3 (May 13, 2009): 221–51. http://dx.doi.org/10.3166/ria.23.221-251.
Host, Shirley and Nicolas Sabouret. "Apprentissage par renforcement d'actes de communication dans un système multi-agent". Revue d'intelligence artificielle 24, no. 2 (April 17, 2010): 159–88. http://dx.doi.org/10.3166/ria.24.159-188.
Villatte, Matthieu, David Scholiers and Esteve Freixa i Baqué. "Apprentissage du comportement optimal par exposition aux contingences dans le dilemme de Monty Hall". ACTA COMPORTAMENTALIA 12, no. 1 (June 1, 2004): 5–24. http://dx.doi.org/10.32870/ac.v12i1.14548.
CHIALI, Ramzi. "Le texte littéraire comme référentiel préférentiel dans le renforcement de la compétence interculturelle en contexte institutionnel. Réflexion et dynamique didactique". Revue plurilingue : Études des Langues, Littératures et Cultures 7, no. 1 (July 14, 2023): 70–78. http://dx.doi.org/10.46325/ellic.v7i1.99.
Altintas, Gulsun and Isabelle Royer. "Renforcement de la résilience par un apprentissage post-crise : une étude longitudinale sur deux périodes de turbulence". M@n@gement 12, no. 4 (2009): 266. http://dx.doi.org/10.3917/mana.124.0266.
Dutech, Alain and Manuel Samuelides. "Apprentissage par renforcement pour les processus décisionnels de Markov partiellement observés. Apprendre une extension sélective du passé". Revue d'intelligence artificielle 17, no. 4 (August 1, 2003): 559–89. http://dx.doi.org/10.3166/ria.17.559-589.
Scholiers, David and Matthieu Villatte. "Comportement Non-optimal versus Illusion Cognitive". ACTA COMPORTAMENTALIA 11, no. 1 (June 1, 2003): 5–17. http://dx.doi.org/10.32870/ac.v11i1.14611.
BOUCHET, N., L. FRENILLOT, M. DELAHAYE, M. GAILLARD, P. MESTHE, E. ESCOURROU and L. GIMENEZ. "GESTION DES EMOTIONS VECUES PAR LES ETUDIANTS EN 3E CYCLE DE MEDECINE GENERALE DE TOULOUSE AU COURS DE LA PRISE EN CHARGE DES PATIENTS : ETUDE QUALITATIVE". EXERCER 34, no. 192 (April 1, 2023): 184–90. http://dx.doi.org/10.56746/exercer.2023.192.184.
Theses on the topic "Apprentissage par renforcement multiagent"
Dinneweth, Joris. "Vers des approches hybrides fondées sur l'émergence et l'apprentissage : prise en compte des véhicules autonomes dans le trafic". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG099.
According to the World Health Organization, road accidents cause almost 1.2 million deaths and 40 million injuries each year. In wealthy countries, safety standards prevent a large proportion of accidents; the remaining accidents are caused by human behavior. For this reason, some plan to automate road traffic, i.e., to replace humans as drivers of their vehicles. However, automating road traffic can hardly be achieved overnight, so driving robots (DRs) and human drivers could cohabit in mixed traffic. Our thesis focuses on the safety issues that may arise from behavioral differences between DRs and human drivers. DRs are designed to respect formal norms, those of the Highway Code. Human drivers, on the other hand, are opportunistic, not hesitating to break formal norms and to adopt new, informal ones. The emergence of new behaviors in traffic can make it more heterogeneous and encourage accidents caused by misinterpretation of these new behaviors. We believe that minimizing this behavioral heterogeneity would reduce these risks. Our thesis therefore proposes a decision-making model for DRs whose behavior is intended to be close to non-hazardous human practices, in order to minimize the heterogeneity between DR and human driver behavior and to promote acceptance by the latter. To achieve this, we adopt a multidisciplinary approach, inspired by studies in driving psychology and combining traffic simulation with multi-agent reinforcement learning (MARL). MARL consists of learning a behavior by trial and error, guided by a utility function. Thanks to its ability to generalize, especially via neural networks, MARL can be adapted to any environment, including traffic. We use it to teach our decision model behavior that is robust to the diversity of traffic situations. To avoid incidents, DR manufacturers could design relatively homogeneous and defensive behaviors rather than opportunistic ones.
However, this approach risks making DRs predictable and, therefore, vulnerable to opportunistic behavior by human drivers. The consequences could then be detrimental to both traffic fluidity and safety. Our first contribution aims to reproduce heterogeneous traffic, i.e., traffic in which each vehicle exhibits a unique behavior. We assume that making the behavior of DRs heterogeneous reduces their predictability, so that opportunistic human drivers are less able to anticipate their actions. This paradigm therefore treats the behavioral heterogeneity of DRs as a critical feature for the safety and fluidity of mixed traffic. In an experimental phase, we demonstrate the ability of our model to produce heterogeneous behavior while meeting some of the challenges of MARL. Our second contribution is the integration of informal norms into our DR decision model. We focus exclusively on the notion of social value orientation, which describes individuals' social behaviors such as altruism or selfishness. Starting with a highway merging scenario, we evaluate the impact of social orientation on the fluidity and safety of merging vehicles. We show that altruism can improve safety, but that its actual impact is highly dependent on traffic density.
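The abstract describes MARL as learning a behavior by trial and error guided by a utility function. As a rough illustration only, not the thesis's model, here is a minimal sketch of two independent Q-learners coordinating in a toy merge/yield game; the actions, reward function, and hyperparameters are all invented for the example:

```python
import random

# Independent Q-learning in a 2-agent coordination game: each agent
# keeps its own Q-values over its two actions and learns by trial and
# error from a shared reward signal (all values are illustrative).

ACTIONS = [0, 1]  # e.g. "merge now" vs. "yield"
ALPHA, EPSILON, EPISODES = 0.1, 0.2, 5000

def reward(a1, a2):
    # Coordinating on the same action is rewarded.
    return 1.0 if a1 == a2 else 0.0

def epsilon_greedy(q, rng):
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[a])

def train(seed=0):
    rng = random.Random(seed)
    q1 = {a: 0.0 for a in ACTIONS}
    q2 = {a: 0.0 for a in ACTIONS}
    for _ in range(EPISODES):
        a1, a2 = epsilon_greedy(q1, rng), epsilon_greedy(q2, rng)
        r = reward(a1, a2)
        q1[a1] += ALPHA * (r - q1[a1])  # stateless Q-update
        q2[a2] += ALPHA * (r - q2[a2])
    return q1, q2

q1, q2 = train()
```

After training, both agents' greedy actions coincide: the shared reward has driven the two independent learners toward the same convention, the behavioral homogeneity the thesis discusses.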
Zimmer, Matthieu. "Apprentissage par renforcement développemental". Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0008/document.
Reinforcement learning allows an agent to learn a behavior that has never been previously defined by humans. The agent discovers the environment and the consequences of its actions through interaction: it learns from its own experience, without pre-established knowledge of the goals or effects of its actions. This thesis examines how deep learning can help reinforcement learning handle continuous spaces and environments with many degrees of freedom, in order to solve problems closer to reality. Indeed, neural networks scale well and have good representational power: they make it possible to approximate functions on continuous spaces and allow a developmental approach, because they require little a priori knowledge of the domain. We seek to reduce the amount of interaction the agent needs to achieve acceptable behavior. To do so, we propose the Neural Fitted Actor-Critic framework, which defines several data-efficient actor-critic algorithms. We examine how the agent can fully exploit the transitions generated by previous behaviors by integrating off-policy data into the proposed framework. Finally, we study how the agent can learn faster by taking advantage of the development of its body, in particular by gradually increasing the dimensionality of its sensorimotor space.
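The Neural Fitted Actor-Critic algorithms themselves are not detailed in the abstract. As a hedge, here is only a minimal one-state actor-critic sketch of the general idea the framework builds on: a softmax actor updated by a policy gradient, with a scalar critic as baseline. The learning rates and reward probabilities are invented for illustration:

```python
import math
import random

# Minimal one-state actor-critic: the actor keeps preferences over two
# actions (softmax policy); the critic keeps a state-value estimate
# used as a baseline for the policy-gradient update.

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def train(true_rewards=(0.2, 0.8), steps=3000,
          alpha_actor=0.1, alpha_critic=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0, 0.0]   # actor parameters
    value = 0.0          # critic: state-value estimate
    for _ in range(steps):
        probs = softmax(prefs)
        a = 0 if rng.random() < probs[0] else 1
        r = 1.0 if rng.random() < true_rewards[a] else 0.0
        td_error = r - value              # advantage estimate
        value += alpha_critic * td_error  # critic update
        for i in range(2):                # actor update (policy gradient)
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += alpha_actor * td_error * grad
    return softmax(prefs)

probs = train()
```

The policy concentrates on the higher-reward action; the thesis's contribution is making such updates data-efficient with neural function approximators and off-policy transitions, which this sketch does not attempt.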
Kozlova, Olga. "Apprentissage par renforcement hiérarchique et factorisé". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2010. http://tel.archives-ouvertes.fr/tel-00632968.
Filippi, Sarah. "Stratégies optimistes en apprentissage par renforcement". PhD thesis, École nationale supérieure des télécommunications - ENST, 2010. http://tel.archives-ouvertes.fr/tel-00551401.
Texto completoThéro, Héloïse. "Contrôle, agentivité et apprentissage par renforcement". Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE028/document.
Sense of agency, or subjective control, can be defined as the feeling that we control our actions and, through them, effects in the outside world. This cluster of experiences depends on the ability to learn action-outcome contingencies, and a classical algorithm to model this originates in the field of human reinforcement learning. In this PhD thesis, we used a cognitive modeling approach to further investigate the interaction between perceived control and reinforcement learning. First, we saw that participants undergoing a reinforcement-learning task experienced higher agency; this influence of reinforcement learning on agency comes as no surprise, because reinforcement learning relies on linking a voluntary action and its outcome. But our results also suggest that agency influences reinforcement learning in two ways. We found that people learn action-outcome contingencies based on a default assumption: that their actions make a difference to the world. Finally, we also found that the mere fact of choosing freely shapes the learning processes that follow that decision. Our general conclusion is that agency and reinforcement learning, two fundamental fields of human psychology, are deeply intertwined. Unlike machines, humans care about being in control and about making the right choice, and this results in integrating information in a one-sided way.
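The "default assumption" finding can be caricatured with a delta-rule contingency learner whose estimate starts from an optimistic prior; the prior and learning rate below are illustrative assumptions, not parameters from the thesis:

```python
# Delta-rule learner for an action-outcome contingency, with an
# optimistic prior expressing the "my actions matter" default
# assumption discussed in the abstract (illustrative values only).

def learn_contingency(outcomes, prior=0.9, alpha=0.2):
    """Track the estimated probability that acting produces the outcome."""
    estimate = prior  # optimistic default: the action is assumed effective
    for observed in outcomes:
        estimate += alpha * ((1.0 if observed else 0.0) - estimate)
    return estimate

# After many trials with no effect, the estimate decays toward 0;
# after consistently effective actions, it stays high.
low = learn_contingency([False] * 30)
high = learn_contingency([True] * 30)
```

The asymmetry the thesis reports would show up in such a model as a prior well above 0.5: absent evidence, the learner behaves as if its actions were effective.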
Munos, Rémi. "Apprentissage par renforcement, étude du cas continu". Paris, EHESS, 1997. http://www.theses.fr/1997EHESA021.
Texto completoLesner, Boris. "Planification et apprentissage par renforcement avec modèles d'actions compacts". Caen, 2011. http://www.theses.fr/2011CAEN2074.
We study Markov Decision Processes represented with Probabilistic STRIPS action models. The first part of our work is about solving those processes in a compact way. To that end we propose two algorithms. The first, based on propositional formula manipulation, obtains approximate solutions in tractable propositional fragments such as Horn and 2-CNF. The second algorithm solves problems represented in PPDDL exactly and efficiently, using a new notion of extended value functions. The second part is about learning such action models. We propose different approaches to the problem of ambiguous observations occurring during learning. First, a heuristic method based on Linear Programming gives good results in practice, yet without theoretical guarantees. We then describe a learning algorithm in the "Know What It Knows" framework. This approach gives strong theoretical guarantees on the quality of the learned models as well as on the sample complexity. These two approaches are then put into a Reinforcement Learning setting to allow an empirical evaluation of their respective performances.
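For readers unfamiliar with the value functions the thesis manipulates, here is a standard value-iteration sketch on a tiny, explicitly enumerated MDP. The thesis itself operates on compact Probabilistic STRIPS/PPDDL models; the states, transitions, and rewards below are invented for illustration:

```python
# Standard value iteration on a tiny explicitly enumerated MDP.
# The thesis works with compact Probabilistic STRIPS / PPDDL models;
# this flat sketch only illustrates the value functions being computed.

GAMMA, EPS = 0.9, 1e-6

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "s0": {"a": [(0.8, "s1", 0.0), (0.2, "s0", 0.0)],
           "b": [(1.0, "s0", 0.1)]},
    "s1": {"a": [(1.0, "s1", 1.0)]},
}

def value_iteration(transitions):
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(
                sum(p * (r + GAMMA * values[ns]) for p, ns, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < EPS:
            return values

values = value_iteration(transitions)
```

Here "s1" is absorbing with reward 1 per step, so its value converges to 1/(1-GAMMA) = 10; a compact representation avoids enumerating such state-action tables explicitly.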
Maillard, Odalric-Ambrym. "APPRENTISSAGE SÉQUENTIEL : Bandits, Statistique et Renforcement". PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00845410.
Texto completoKlein, Édouard. "Contributions à l'apprentissage par renforcement inverse". Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0185/document.
This thesis, "Contributions à l'apprentissage par renforcement inverse", makes three major contributions to the community. The first is a method for estimating the feature expectation, a quantity involved in most state-of-the-art approaches, which were thus extended to a batch, off-policy setting. The second major contribution is an Inverse Reinforcement Learning algorithm, structured classification for inverse reinforcement learning (SCIRL), which relaxes a standard constraint in the field, the repeated solving of a Markov Decision Process, by introducing the temporal structure of this process (via the feature expectation) into a structured margin classification algorithm. Its theoretical guarantee and good empirical performance allowed it to be presented at a major international conference, NIPS. Finally, the third contribution is cascaded supervised learning for inverse reinforcement learning (CSI), a method that learns the expert's behavior via a supervised learning approach and then introduces the temporal structure of the MDP via a regression involving the score function of the classifier. This method offers the same type of theoretical guarantee as SCIRL, but uses standard components for classification and regression, which makes it simpler to use. This work will be presented at another major international conference, ECML.
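The feature expectation mentioned above, the expected discounted sum of feature vectors along expert trajectories, can be estimated from sampled trajectories by simple Monte Carlo averaging. The toy states and indicator features below are assumptions for illustration, not the thesis's estimator:

```python
# Monte Carlo estimate of the feature expectation: the average
# discounted sum of per-state feature vectors over a set of expert
# trajectories (toy states and indicator features for illustration).

GAMMA = 0.9

def feature_expectation(trajectories, features, gamma=GAMMA):
    """Average discounted feature counts over state trajectories."""
    dim = len(next(iter(features.values())))
    mu = [0.0] * dim
    for traj in trajectories:
        for t, state in enumerate(traj):
            for i, f in enumerate(features[state]):
                mu[i] += (gamma ** t) * f
    return [m / len(trajectories) for m in mu]

# Two hypothetical expert trajectories over states with indicator features.
features = {"s0": [1.0, 0.0], "s1": [0.0, 1.0]}
mu = feature_expectation([["s0", "s1"], ["s0", "s0"]], features)
```

In linear-reward IRL, matching this vector between the expert and the learned policy is what ties the classification view back to the underlying MDP.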
Books on the topic "Apprentissage par renforcement multiagent"
Sutton, Richard S. Reinforcement learning: An introduction. Cambridge, Mass: MIT Press, 1998.
Ontario. Esquisse de cours 12e année: Sciences de l'activité physique pse4u cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Technologie de l'information en affaires btx4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Études informatiques ics4m cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Mathématiques de la technologie au collège mct4c cours précollégial. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Sciences snc4m cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: English eae4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Le Canada et le monde: une analyse géographique cgw4u cours préuniversitaire. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Environnement et gestion des ressources cgr4e cours préemploi. Vanier, Ont: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Histoire de l'Occident et du monde chy4c cours précollégial. Vanier, Ont: CFORP, 2002.
Book chapters on the topic "Apprentissage par renforcement multiagent"
Tazdaït, Tarik and Rabia Nessah. "5. Vote et apprentissage par renforcement". In Le paradoxe du vote, 157–77. Éditions de l'École des hautes études en sciences sociales, 2013. http://dx.doi.org/10.4000/books.editionsehess.1931.
Texto completoBENDELLA, Mohammed Salih y Badr BENMAMMAR. "Impact de la radio cognitive sur le green networking : approche par apprentissage par renforcement". En Gestion du niveau de service dans les environnements émergents. ISTE Group, 2020. http://dx.doi.org/10.51926/iste.9002.ch8.
Texto completoInformes sobre el tema "Apprentissage par renforcement mulitagent"
Melloni, Gian. Le leadership des autorités locales en matière d'assainissement et d'hygiène : expériences et apprentissage de l'Afrique de l'Ouest. Institute of Development Studies (IDS), January 2022. http://dx.doi.org/10.19088/slh.2022.002.