Follow this link to see other types of publications on this topic: Safe RL.

Journal articles on the topic "Safe RL"

Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles

Check out the top 50 journal articles on the topic "Safe RL".

Next to every entry in the bibliography there is an "Add to bibliography" button. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication as a .pdf file and read its abstract online, whenever these are available in the work's metadata.

Browse journal articles on a wide variety of disciplines and compile an appropriate bibliography.

1

Carr, Steven, Nils Jansen, Sebastian Junges, and Ufuk Topcu. "Safe Reinforcement Learning via Shielding under Partial Observability." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 14748–56. http://dx.doi.org/10.1609/aaai.v37i12.26723.

Abstract:
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment. A family of approaches to this problem assume domain knowledge in the form of a (partial) model of this environment to decide upon the safety of an action. A so-called shield forces the RL agent to select only safe actions. However, for adoption in various applications, one must look beyond enforcing safety and also ensure the applicability of RL with good performance. We extend the applicability of shields via tight integration wit
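
To make the shielding idea above concrete, here is a minimal sketch (not the authors' implementation): a shield that intersects the agent's action set with the actions a (partial) environment model labels safe. The names `Shield` and `is_safe` are illustrative.

```python
class Shield:
    """Minimal action shield: restricts the agent to actions a (partial)
    environment model labels safe. `is_safe(state, action)` is a stub
    standing in for the model-based safety check."""

    def __init__(self, is_safe):
        self.is_safe = is_safe

    def filter(self, state, actions):
        safe = [a for a in actions if self.is_safe(state, a)]
        # Fall back to the full action set if the model labels nothing
        # safe, so the agent is never left without a choice.
        return safe if safe else list(actions)


# Usage: the RL agent may only pick from the shielded action set.
shield = Shield(is_safe=lambda s, a: a != "cross_red_light")
print(shield.filter(state=None, actions=["wait", "cross_red_light", "turn"]))
# -> ['wait', 'turn']
```
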
2

Ma, Yecheng Jason, Andrew Shen, Osbert Bastani, and Dinesh Jayaraman. "Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (2022): 5404–12. http://dx.doi.org/10.1609/aaai.v36i5.20478.

Abstract:
Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncer
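
The conservative-penalty idea above can be sketched in a few lines, assuming a learned model that reports a cost prediction together with an uncertainty estimate; the names and the update rule below are illustrative, not the paper's code.

```python
def conservative_cost(predicted_cost, model_uncertainty, kappa):
    # Inflate the model's predicted cost by an uncertainty-scaled penalty,
    # so transitions the model is unsure about look more expensive.
    return predicted_cost + kappa * model_uncertainty


def adapt_kappa(kappa, observed_cost, cost_budget, lr=0.05):
    # Adaptive update (sketch): grow the penalty coefficient while the
    # real system violates its cost budget, relax it when there is slack.
    return max(0.0, kappa + lr * (observed_cost - cost_budget))
```

Planning then proceeds against `conservative_cost` rather than the raw model prediction, which is what makes constraint violations under model error less likely.
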
3

Xu, Haoran, Xianyuan Zhan, and Xiangyu Zhu. "Constraints Penalized Q-learning for Safe Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (2022): 8753–60. http://dx.doi.org/10.1609/aaai.v36i8.20855.

Abstract:
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is more appealing for real world RL applications, in which data collection is costly or dangerous. Enforcing constraint satisfaction is non-trivial, especially in offline settings, as there is a potentially large discrepancy between the policy distribution and the data distribution, causing errors in estimating the value of safety constraints. We s
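
A hedged sketch of the constraints-penalized idea above: bootstrap the reward critic only through next actions whose estimated cost stays within the limit. Tensor shapes and names are assumptions of this sketch, not the paper's implementation.

```python
import torch

def cpq_style_target(reward, gamma, q_reward_next, q_cost_next, cost_limit):
    # Next actions the cost critic deems unsafe (or out-of-distribution,
    # if OOD actions are assigned high cost) contribute nothing to the
    # reward critic's bootstrap target.
    safe = (q_cost_next <= cost_limit).float()
    return reward + gamma * safe * q_reward_next


# Toy batch: the second transition's next action is deemed unsafe.
r = torch.tensor([1.0, 1.0])
qr = torch.tensor([5.0, 5.0])
qc = torch.tensor([0.2, 3.0])
print(cpq_style_target(r, 0.99, qr, qc, cost_limit=1.0))  # tensor([5.95, 1.00])
```
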
4

Thananjeyan, Brijen, Ashwin Balakrishna, Suraj Nair, et al. "Recovery RL: Safe Reinforcement Learning With Learned Recovery Zones." IEEE Robotics and Automation Letters 6, no. 3 (2021): 4915–22. http://dx.doi.org/10.1109/lra.2021.3070252.

5

Serrano-Cuevas, Jonathan, Eduardo F. Morales, and Pablo Hernández-Leal. "Safe reinforcement learning using risk mapping by similarity." Adaptive Behavior 28, no. 4 (2019): 213–24. http://dx.doi.org/10.1177/1059712319859650.

Abstract:
Reinforcement learning (RL) has been used to successfully solve sequential decision problems. However, considering risk at the same time as the learning process is an open research problem. In this work, we are interested in the type of risk that can lead to a catastrophic state. Related works that aim to deal with risk propose complex models. In contrast, we follow a simple, yet effective, idea: similar states might lead to similar risk. Using this idea, we propose risk mapping by similarity (RMS), an algorithm for discrete scenarios which infers the risk of newly discovered states by analyzin
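
The "similar states, similar risk" heuristic above admits a very small sketch; the Gaussian similarity kernel and the names below are assumptions of this illustration, not the RMS algorithm itself.

```python
import numpy as np

def risk_by_similarity(new_state, known_states, known_risks, bandwidth=1.0):
    # Estimate the risk of a newly discovered state as a similarity-weighted
    # average of the risks recorded for previously visited states.
    dists = np.linalg.norm(known_states - new_state, axis=1)
    weights = np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2))
    return float(weights @ known_risks) / (weights.sum() + 1e-12)


states = np.array([[0.0, 0.0], [5.0, 5.0]])
risks = np.array([0.9, 0.0])  # a known catastrophic region near the origin
print(risk_by_similarity(np.array([0.5, 0.5]), states, risks))  # close to 0.9
```
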
6

Cheng, Richard, Gábor Orosz, Richard M. Murray, and Joel W. Burdick. "End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3387–95. http://dx.doi.org/10.1609/aaai.v33i01.33013387.

Abstract:
Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) online learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages th
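
The role of a control barrier function is easiest to see on a toy system. Below is a minimal sketch for a 1-D integrator x' = u with a safe interval [x_min, x_max]: the CBF conditions reduce to bounds on u, so the minimal correction to the RL action is a clip. This illustrates the CBF mechanism only, not the paper's combined RL/CBF architecture.

```python
def cbf_safety_filter(u_rl, x, x_min=-1.0, x_max=1.0, alpha=1.0):
    # Barrier functions h1 = x_max - x and h2 = x - x_min are nonnegative
    # exactly on the safe set. The CBF conditions dh/dt >= -alpha * h
    # become u <= alpha*(x_max - x) and u >= -alpha*(x - x_min):
    u_hi = alpha * (x_max - x)
    u_lo = -alpha * (x - x_min)
    # Minimally modify the RL controller's action to respect both bounds.
    return max(u_lo, min(u_rl, u_hi))


print(cbf_safety_filter(u_rl=2.0, x=0.9))  # clipped to 0.1: slows near the boundary
```
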
7

Jurj, Sorin Liviu, Dominik Grundt, Tino Werner, Philipp Borchers, Karina Rothemann, and Eike Möhlmann. "Increasing the Safety of Adaptive Cruise Control Using Physics-Guided Reinforcement Learning." Energies 14, no. 22 (2021): 7572. http://dx.doi.org/10.3390/en14227572.

Abstract:
This paper presents a novel approach for improving the safety of vehicles equipped with Adaptive Cruise Control (ACC) by making use of Machine Learning (ML) and physical knowledge. More exactly, we train a Soft Actor-Critic (SAC) Reinforcement Learning (RL) algorithm that makes use of physical knowledge such as the jam-avoiding distance in order to automatically adjust the ideal longitudinal distance between the ego- and leading-vehicle, resulting in a safer solution. In our use case, the experimental results indicate that the physics-guided (PG) RL approach is better at avoiding collisions at
8

Sakrihei, Helen. "Using automatic storage for ILL – experiences from the National Repository Library in Norway." Interlending & Document Supply 44, no. 1 (2016): 14–16. http://dx.doi.org/10.1108/ilds-11-2015-0035.

Abstract:
Purpose – The purpose of this paper is to share the Norwegian Repository Library (RL)’s experiences with an automatic storage for interlibrary lending (ILL). Design/methodology/approach – This paper describes how the RL uses the automatic storage to deliver ILL services to Norwegian libraries. Chaos storage is the main principle for storage. Findings – Using automatic storage for ILL is efficient, cost-effective and safe. Originality/value – The RL has used automatic storage since 2003, and it is one of a few libraries using this technology.
9

Ding, Yuhao, and Javad Lavaei. "Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (2023): 7396–404. http://dx.doi.org/10.1609/aaai.v37i6.25900.

Abstract:
We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumulative variations do not exceed certain known variation budgets. Designing safe RL algorithms in time-varying environments is particularly challenging because of the need to integrate the constraint vi
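
The primal-dual machinery above rests on a Lagrangian relaxation of the CMDP. A minimal stationary-case sketch follows (the paper's contribution is handling non-stationary objectives and constraints on top of this; the names below are illustrative):

```python
def dual_ascent(lmbda, episode_cost, budget, lr=0.01):
    # The multiplier rises while the policy's cumulative cost exceeds the
    # budget and decays (never below zero) when there is slack.
    return max(0.0, lmbda + lr * (episode_cost - budget))


def shaped_reward(reward, cost, lmbda):
    # The primal step maximizes reward penalized by the current multiplier,
    # i.e., it optimizes the Lagrangian L = R - lambda * (C - budget).
    return reward - lmbda * cost
```
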
10

Tubeuf, Carlotta, Felix Birkelbach, Anton Maly, and René Hofmann. "Increasing the Flexibility of Hydropower with Reinforcement Learning on a Digital Twin Platform." Energies 16, no. 4 (2023): 1796. http://dx.doi.org/10.3390/en16041796.

Abstract:
The increasing demand for flexibility in hydropower systems requires pumped storage power plants to change operating modes and compensate reactive power more frequently. In this work, we demonstrate the potential of applying reinforcement learning (RL) to control the blow-out process of a hydraulic machine during pump start-up and when operating in synchronous condenser mode. Even though RL is a promising method that is currently getting much attention, safety concerns are stalling research on RL for the control of energy systems. Therefore, we present a concept that enables process control wi
11

Yoon, Jae Ung, and Juhong Lee. "Uncertainty Sequence Modeling Approach for Safe and Effective Autonomous Driving." Korean Institute of Smart Media 11, no. 9 (2022): 9–20. http://dx.doi.org/10.30693/smj.2022.11.9.9.

Abstract:
Deep reinforcement learning (RL) is an end-to-end data-driven control method that is widely used in the autonomous driving domain. However, conventional RL approaches have difficulties in applying it to autonomous driving tasks due to problems such as inefficiency, instability, and uncertainty. These issues play an important role in the autonomous driving domain. Although recent studies have attempted to solve these problems, they are computationally expensive and rely on special assumptions. In this paper, we propose a new algorithm MCDT that considers inefficiency, instability, and uncertaint
12

Lin, Xingbin, Deyu Yuan, and Xifei Li. "Reinforcement Learning with Dual Safety Policies for Energy Savings in Building Energy Systems." Buildings 13, no. 3 (2023): 580. http://dx.doi.org/10.3390/buildings13030580.

Abstract:
Reinforcement learning (RL) is being gradually applied in the control of heating, ventilation and air-conditioning (HVAC) systems to learn the optimal control sequences for energy savings. However, due to the “trial and error” issue, the output sequences of RL may cause potential operational safety issues when RL is applied in real systems. To solve those problems, an RL algorithm with dual safety policies for energy savings in HVAC systems is proposed. In the proposed dual safety policies, the implicit safety policy is a part of the RL model, which integrates safety into the optimization targ
13

Marchesini, Enrico, Davide Corsi, and Alessandro Farinelli. "Exploring Safer Behaviors for Deep Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (2022): 7701–9. http://dx.doi.org/10.1609/aaai.v36i7.20737.

Abstract:
We consider Reinforcement Learning (RL) problems where an agent attempts to maximize a reward signal while minimizing a cost function that models unsafe behaviors. Such formalization is addressed in the literature using constrained optimization on the cost, limiting the exploration and leading to a significant trade-off between cost and reward. In contrast, we propose a Safety-Oriented Search that complements Deep RL algorithms to bias the policy toward safety within an evolutionary cost optimization. We leverage evolutionary exploration benefits to design a novel concept of safe mutations tha
14

Egleston, David, Patricia Ann Castelli, and Thomas George Marx. "Developing, validating, and testing a model of reflective leadership." Leadership & Organization Development Journal 38, no. 7 (2017): 886–96. http://dx.doi.org/10.1108/lodj-09-2016-0230.

Abstract:
Purpose: The purpose of this paper is to develop, validate, and test the impacts of reflective leadership (RL) on organizational performance. Design/methodology/approach: This is an empirical study based on over 700 survey responses from business leaders around the world. An instrument was developed to validate the model, and the statistical significance of its impacts on organizational performance was tested. Findings: The findings show that a model of RL consisting of three leadership practices, creating an open and safe work environment, defining purpose, and challenging assumptions had signif
15

Huh, Gene, and Wonjae Cha. "Development and Clinical Application of Real-Time Light-Guided Vocal Fold Injection." Journal of The Korean Society of Laryngology, Phoniatrics and Logopedics 33, no. 1 (2022): 1–6. http://dx.doi.org/10.22469/jkslp.2022.33.1.1.

Abstract:
Vocal fold injection (VFI) is widely accepted as a first line treatment in treating unilateral vocal fold paralysis and other vocal fold diseases. Although VFI is advantageous for its minimal invasiveness and efficiency, the invisibility of the needle tip remains an essential handicap in precise localization. Real-time light-guided vocal fold injection (RL-VFI) is a novel technique that was developed under the concept of performing simultaneous injection with precise placement of the needle tip under light guidance. RL-VFI has confirmed its possibility of technical implementation and the feasi
16

Ramakrishnan, Ramya, Ece Kamar, Debadeepta Dey, Eric Horvitz, and Julie Shah. "Blind Spot Detection for Safe Sim-to-Real Transfer." Journal of Artificial Intelligence Research 67 (February 4, 2020): 191–234. http://dx.doi.org/10.1613/jair.1.11436.

Abstract:
Agents trained in simulation may make errors when performing actions in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult for the agent to discover because the agent is unable to predict them a priori. In this work, we propose the use of oracle feedback to learn a predictive model of these blind spots in order to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: when the agent lacks necessary features to represent the
17

Hao, Hao, Yichen Sun, Xueyun Mei, and Yanjun Zhou. "Reverse Logistics Network Design of Electric Vehicle Batteries considering Recall Risk." Mathematical Problems in Engineering 2021 (August 18, 2021): 1–16. http://dx.doi.org/10.1155/2021/5518049.

Abstract:
In 2018-2019, the recall scale of electric vehicles (EVs) in China reached 168,700 units; recalls account for approximately 6.9% of sales volume. There are imperative reasons for electric vehicle batteries (EVBs) recalls, such as mandatory laws or policies, safety and environmental pollution risks, and the high value of EVB echelon use, and thus, it has become increasingly important to reasonably design a reverse logistics (RL) network for an EVB recall. In this study, a multiobjective and multiperiod recall RL network model is developed to minimize safety and environmental risks, maximize the
18

Ray, Kaustabha, and Ansuman Banerjee. "Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning." ACM Transactions on Embedded Computing Systems 20, no. 6 (2021): 1–33. http://dx.doi.org/10.1145/3475991.

Abstract:
Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low latency access to services deployed on edge servers to avert network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy which is used to decide the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to e
19

Delgado, Tomás, Marco Sánchez Sorondo, Víctor Braberman, and Sebastián Uchitel. "Exploration Policies for On-the-Fly Controller Synthesis: A Reinforcement Learning Approach." Proceedings of the International Conference on Automated Planning and Scheduling 33, no. 1 (2023): 569–77. http://dx.doi.org/10.1609/icaps.v33i1.27238.

Abstract:
Controller synthesis is in essence a case of model-based planning for non-deterministic environments in which plans (actually “strategies”) are meant to preserve system goals indefinitely. In the case of supervisory control environments are specified as the parallel composition of state machines and valid strategies are required to be “non-blocking” (i.e., always enabling the environment to reach certain marked states) in addition to safe (i.e., keep the system within a safe zone). Recently, On-the-fly Directed Controller Synthesis techniques were proposed to avoid the exploration of the entir
20

Bolster, Lauren, Mark Bosch, Brian Brownbridge, and Anurag Saxena. "RAP Trial: Ringer's Lactate and Packed Red Blood Cell Transfusion, An in Vitro Study and Chart Review." Blood 114, no. 22 (2009): 2105. http://dx.doi.org/10.1182/blood.v114.22.2105.2105.

Abstract:
Abstract 2105, Poster Board II-82. Background: The Canadian Blood Services and the American Association of Blood Banks state that intravenous solution administered with packed red blood cells (pRBC) must be isotonic and must not contain calcium or glucose. This recommendation is based on in vitro investigations demonstrating that calcium containing solutions can initiate in vitro coagulation in citrated blood (Ryden 1975, Dickson 1980, Lorenzo 1998). Recently this recommendation has been challenged by in vitro studies that combined AS-3 pRBC with Ringer's Lactate (RL) (Albert 2009). Cur
21

Romey, Aurore, Hussaini G. Ularamu, Abdulnaci Bulut, et al. "Field Evaluation of a Safe, Easy, and Low-Cost Protocol for Shipment of Samples from Suspected Cases of Foot-and-Mouth Disease to Diagnostic Laboratories." Transboundary and Emerging Diseases 2023 (August 5, 2023): 1–15. http://dx.doi.org/10.1155/2023/9555213.

Abstract:
Identification and characterization of the foot-and-mouth disease virus (FMDV) strains circulating in endemic countries and their dynamics are essential elements of the global FMD control strategy. Characterization of FMDV is usually performed in reference laboratories (RL). However, shipping of FMD samples to RL is a challenge due to the cost and biosafety requirements of transportation, resulting in a lack of knowledge about the strains circulating in some endemic areas. In order to simplify this step and to encourage sample submission to RL, we have previously developed a low-cost protocol
22

Dai, Juntao, Jiaming Ji, Long Yang, Qian Zheng, and Gang Pan. "Augmented Proximal Policy Optimization for Safe Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (2023): 7288–95. http://dx.doi.org/10.1609/aaai.v37i6.25888.

Abstract:
Safe reinforcement learning considers practical scenarios that maximize the return while satisfying safety constraints. Current algorithms, which suffer from training oscillations or approximation errors, still struggle to update the policy efficiently with precise constraint satisfaction. In this article, we propose Augmented Proximal Policy Optimization (APPO), which augments the Lagrangian function of the primal constrained problem via attaching a quadratic deviation term. The constructed multiplier-penalty function dampens cost oscillation for stable convergence while being equivalent to t
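
A sketch of the augmented-Lagrangian construction described above: attaching a quadratic deviation term to the constraint damps the oscillation a plain multiplier method can cause, while preserving the constrained problem's stationary points. The functions below are illustrative, not the APPO objective itself.

```python
def augmented_penalty(cost, budget, lmbda, rho):
    # Multiplier-penalty function: linear (Lagrangian) term plus a
    # quadratic deviation term that stabilizes convergence.
    g = cost - budget
    return lmbda * g + 0.5 * rho * g * g


def multiplier_step(lmbda, cost, budget, rho):
    # Standard method-of-multipliers update, kept nonnegative for an
    # inequality constraint cost <= budget.
    return max(0.0, lmbda + rho * (cost - budget))
```
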
23

Krstić, Mladen, Giulio Paolo Agnusdei, Pier Paolo Miglietta, Snežana Tadić, and Violeta Roso. "Applicability of Industry 4.0 Technologies in the Reverse Logistics: A Circular Economy Approach Based on COmprehensive Distance Based RAnking (COBRA) Method." Sustainability 14, no. 9 (2022): 5632. http://dx.doi.org/10.3390/su14095632.

Abstract:
The logistics sector plays one of the most important roles in the supply chain with the aim of providing a fast, flexible, safe, economical, efficient, and environmentally acceptable performance of freight transport flows. In addition, the popularization of the concept of a circular economy (CE) used to retain goods, components, and materials at their highest usability and value at all times, illustrates the importance of the adequate performance of reverse logistics (RL) processes. However, traditional RL is unable to cope with the requirements of modern supply chains and requires the applica
24

Prasetyo, Risky Vitria, Abdul Latief Azis, and Soegeng Soegijanto. "Comparison of the efficacy and safety of hydroxyethyl starch 130/0.4 and Ringer's lactate in children with grade III dengue hemorrhagic fever." Paediatrica Indonesiana 49, no. 2 (2009): 97. http://dx.doi.org/10.14238/pi49.2.2009.97-103.

Abstract:
Background: Theoretically, hydroxyethyl starch (HES) will give more rapid recovery from shock, including in dengue shock syndrome (DSS), and currently gained popularity for its less deleterious effects on renal function and blood coagulation. Objectives: To compare the efficacy and safety of HES 130/0.4 and Ringer's lactate (RL) for shock recovery in children with DSS. Methods: A randomized controlled study was performed on 39 children admitted with DSS at Dr. Soetomo Hospital, Surabaya, between March and May 2007. Children were grouped into grade III (n=25) and grade IV (n=14) dengue hemorrhagic fever (DHF)
25

Böck, Markus, Julien Malle, Daniel Pasterk, Hrvoje Kukina, Ramin Hasani, and Clemens Heitzinger. "Superhuman performance on sepsis MIMIC-III data by distributional reinforcement learning." PLOS ONE 17, no. 11 (2022): e0275358. http://dx.doi.org/10.1371/journal.pone.0275358.

Abstract:
We present a novel setup for treating sepsis using distributional reinforcement learning (RL). Sepsis is a life-threatening medical emergency. Its treatment is considered to be a challenging high-stakes decision-making problem, which has to procedurally account for risk. Treating sepsis by machine learning algorithms is difficult due to a couple of reasons: There is limited and error-afflicted initial data in a highly complex biological system combined with the need to make robust, transparent and safe decisions. We demonstrate a suitable method that combines data imputation by a kNN model usi
26

Li, Yue, Xiao Yong Bai, Shi Jie Wang, Luo Yi Qin, Yi Chao Tian, and Guang Jie Luo. "Evaluating of the spatial heterogeneity of soil loss tolerance and its effects on erosion risk in the carbonate areas of southern China." Solid Earth 8, no. 3 (2017): 661–69. http://dx.doi.org/10.5194/se-8-661-2017.

Abstract:
Abstract. Soil loss tolerance (T value) is one of the criteria in determining the necessity of erosion control measures and ecological restoration strategy. However, the validity of this criterion in subtropical karst regions is strongly disputed. In this study, T value is calculated based on soil formation rate by using a digital distribution map of carbonate rock assemblage types. Results indicated a spatial heterogeneity and diversity in soil loss tolerance. Instead of only one criterion, a minimum of three criteria should be considered when investigating the carbonate areas of southern Chi
27

Kondrup, Flemming, Thomas Jiralerspong, Elaine Lau, et al. "Towards Safe Mechanical Ventilation Treatment Using Deep Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (2023): 15696–702. http://dx.doi.org/10.1609/aaai.v37i13.26862.

Abstract:
Mechanical ventilation is a key form of life support for patients with pulmonary impairment. Healthcare workers are required to continuously adjust ventilator settings for each patient, a challenging and time consuming task. Hence, it would be beneficial to develop an automated decision support tool to optimize ventilation treatment. We present DeepVent, a Conservative Q-Learning (CQL) based offline Deep Reinforcement Learning (DRL) agent that learns to predict the optimal ventilator parameters for a patient to promote 90 day survival. We design a clinically relevant intermediate reward that e
28

Miyajima, Hirofumi, Noritaka Shigei, Syunki Makino, et al. "A proposal of privacy preserving reinforcement learning for secure multiparty computation." Artificial Intelligence Research 6, no. 2 (2017): 57. http://dx.doi.org/10.5430/air.v6n2p57.

Abstract:
Many studies have been done with the security of cloud computing. Though data encryption is a typical approach, high computing complexity for encryption and decryption of data is needed. Therefore, safe system for distributed processing with secure data attracts attention, and a lot of studies have been done. Secure multiparty computation (SMC) is one of these methods. Specifically, two learning methods for machine learning (ML) with SMC are known. One is to divide learning data into several subsets and perform learning. The other is to divide each item of learning data and perform learning. S
29

Thananjeyan, Brijen, Ashwin Balakrishna, Ugo Rosolia, et al. "Safety Augmented Value Estimation From Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks." IEEE Robotics and Automation Letters 5, no. 2 (2020): 3612–19. http://dx.doi.org/10.1109/lra.2020.2976272.

30

Ren, Tianzhu, Yuanchang Xie, and Liming Jiang. "Cooperative Highway Work Zone Merge Control Based on Reinforcement Learning in a Connected and Automated Environment." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 10 (2020): 363–74. http://dx.doi.org/10.1177/0361198120935873.

Abstract:
Given the aging infrastructure and the anticipated growing number of highway work zones in the U.S.A., it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected, and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicl
31

Reda, Ahmad, and József Vásárhelyi. "Design and Implementation of Reinforcement Learning for Automated Driving Compared to Classical MPC Control." Designs 7, no. 1 (2023): 18. http://dx.doi.org/10.3390/designs7010018.

Abstract:
Many classic control approaches have already proved their merits in the automotive industry. Model predictive control (MPC) is one of the most commonly used methods. However, its efficiency drops off with increase in complexity of the driving environment. Recently, machine learning methods have been considered an efficient alternative to classical control approaches. Even with successful implementation of reinforcement learning in real-world applications, it is still not commonly used compared to supervised and unsupervised learning. In this paper, a reinforcement learning (RL)-based framework
32

Gardille, Arnaud, and Ola Ahmad. "Towards Safe Reinforcement Learning via OOD Dynamics Detection in Autonomous Driving System (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (2023): 16216–17. http://dx.doi.org/10.1609/aaai.v37i13.26968.

Abstract:
Deep reinforcement learning (DRL) has proven effective in training agents to achieve goals in complex environments. However, a trained RL agent may exhibit, during deployment, unexpected behavior when faced with a situation where its state transitions differ even slightly from the training environment. Such a situation can arise for a variety of reasons. Rapid and accurate detection of anomalous behavior appears to be a prerequisite for using DRL in safety-critical systems, such as autonomous driving. We propose a novel OOD detection algorithm based on modeling the transition function of the t
33

Free, David. "In the News." College & Research Libraries News 80, no. 10 (2019): 541. http://dx.doi.org/10.5860/crln.80.10.541.

Abstract:
Welcome to the November 2019 issue of C&RL News. Many academic libraries have begun focusing efforts on addressing the mental health and well being of their populations. Marshall University in West Virginia, one of the states hit hardest by the recent opioid crises, focused on their libraries as mental health safe spaces. Sabrina Thomas and Kacy Lovelace discuss their collaborative campus project in “Combining efforts.” Learn more about resources available for “Mental health awareness” in this month’s Internet Resources article by Emily Underwood.
34

Xu, Xibao, Yushen Chen, and Chengchao Bai. "Deep Reinforcement Learning-Based Accurate Control of Planetary Soft Landing." Sensors 21, no. 23 (2021): 8161. http://dx.doi.org/10.3390/s21238161.

Abstract:
Planetary soft landing has been studied extensively due to its promising application prospects. In this paper, a soft landing control algorithm based on deep reinforcement learning (DRL) with good convergence property is proposed. First, the soft landing problem of the powered descent phase is formulated and the theoretical basis of Reinforcement Learning (RL) used in this paper is introduced. Second, to make it easier to converge, a reward function is designed to include process rewards like velocity tracking reward, solving the problem of sparse reward. Then, by including the fuel consumptio
35

Simão, Thiago D., Marnix Suilen, and Nils Jansen. "Safe Policy Improvement for POMDPs via Finite-State Controllers." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 15109–17. http://dx.doi.org/10.1609/aaai.v37i12.26763.

Abstract:
We study safe policy improvement (SPI) for partially observable Markov decision processes (POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) historical data about an environment, and (2) the so-called behavior policy that previously generated this data by interacting with the environment. SPI methods neither require access to a model nor the environment itself, and aim to reliably improve upon the behavior policy in an offline manner. Existing methods make the strong assumption that the environment is fully observable. In our novel approach to the SPI pr
36

Zhang, Linrui, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, and Dacheng Tao. "Evaluating Model-Free Reinforcement Learning toward Safety-Critical Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 15313–21. http://dx.doi.org/10.1609/aaai.v37i12.26786.

Abstract:
Safety comes first in many real-world applications involving autonomous agents. Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of those algorithms that adheres to safety constraints at each decision step under complex and unknown dynamics. In this paper, we revisit prior work in this scope from the perspective of state-wise safe RL and categorize them as projection-based, recovery-based, and optimization-based approaches, respectively. Furthermore, we propose Unrolling Safety Layer (USL), a joint
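
The projection-based family mentioned above is easiest to picture with a post-hoc safety layer: nudge the proposed action down the gradient of a learned cost critic until the predicted per-step cost is within the limit. This is a sketch under assumed names (`cost_critic` is any differentiable model), not the paper's Unrolling Safety Layer.

```python
import torch

def project_action(action, state, cost_critic, cost_limit, steps=10, lr=0.1):
    # Gradient-based projection: descend on the critic's predicted cost
    # until it falls below the per-step limit or the step budget runs out.
    a = action.clone().detach().requires_grad_(True)
    for _ in range(steps):
        predicted_cost = cost_critic(state, a)
        if predicted_cost.item() <= cost_limit:
            break
        predicted_cost.backward()
        with torch.no_grad():
            a -= lr * a.grad
        a.grad.zero_()
    return a.detach()


# Toy differentiable cost model (an assumption of this sketch).
critic = lambda s, a: (a ** 2).sum()
print(project_action(torch.tensor([2.0, -1.0]), None, critic, cost_limit=1.0))
```
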
37

Angele, Martin K., Nadia Smail, Markus W. Knöferl, Alfred Ayala, William G. Cioffi, and Irshad H. Chaudry. "l-Arginine restores splenocyte functions after trauma and hemorrhage potentially by improving splenic blood flow." American Journal of Physiology-Cell Physiology 276, no. 1 (1999): C145–C151. http://dx.doi.org/10.1152/ajpcell.1999.276.1.c145.

Abstract:
Several studies indicate that immune responses are markedly depressed early after onset of hemorrhage. Decreased organ blood flow has been implicated in the pathophysiology of altered immune responses after trauma-hemorrhage. In this regard, administration of l-arginine has been shown to restore depressed intestinal and hepatic blood flow after trauma-hemorrhage, probably due to provision of substrate for constitutive nitric oxide synthase (cNOS). It remains unknown, however, whether administration of l-arginine also ameliorates depressed splenic blood flow and whether this agent has any salutar
38

Staessens, Tom, Tom Lefebvre, and Guillaume Crevecoeur. "Optimizing Cascaded Control of Mechatronic Systems through Constrained Residual Reinforcement Learning." Machines 11, no. 3 (2023): 402. http://dx.doi.org/10.3390/machines11030402.

Abstract:
Cascaded control structures are prevalent in industrial systems with many disturbances to obtain stable control but are cumbersome and challenging to tune. In this work, we propose cascaded constrained residual reinforcement learning (RL), an intuitive method that allows to improve the performance of a cascaded control structure while maintaining safe operation at all times. We draw inspiration from the constrained residual RL framework, in which a constrained reinforcement learning agent learns corrective adaptations to a base controller’s output to increase optimality. We first revisit the i
39

Lv, Kexuan, Xiaofei Pei, Ci Chen, and Jie Xu. "A Safe and Efficient Lane Change Decision-Making Strategy of Autonomous Driving Based on Deep Reinforcement Learning." Mathematics 10, no. 9 (2022): 1551. http://dx.doi.org/10.3390/math10091551.

Abstract:
As an indispensable branch of machine learning (ML), reinforcement learning (RL) plays a prominent role in the decision-making process of autonomous driving (AD), which enables autonomous vehicles (AVs) to learn an optimal driving strategy through continuous interaction with the environment. This paper proposes a deep reinforcement learning (DRL)-based motion planning strategy for AD tasks in the highway scenarios where an AV merges into two-lane road traffic flow and realizes the lane changing (LC) maneuvers. We integrate the DRL model into the AD system relying on the end-to-end learning met
40

Jurj, Sorin Liviu, Tino Werner, Dominik Grundt, Willem Hagemann, and Eike Möhlmann. "Towards Safe and Sustainable Autonomous Vehicles Using Environmentally-Friendly Criticality Metrics." Sustainability 14, no. 12 (2022): 6988. http://dx.doi.org/10.3390/su14126988.

Abstract:
This paper presents a mathematical analysis of several criticality metrics used for evaluating the safety of autonomous vehicles (AVs) and also proposes novel environmentally-friendly metrics with the scope of facilitating their selection by future researchers who want to evaluate both safety and the environmental impact of AVs. Regarding this, first, we investigate whether the criticality metrics which are used to quantify the severeness of critical situations in autonomous driving are well-defined and work as intended. In some cases, the well-definedness or the intendedness of the metrics wi
41

Maw, Aye Aye, Maxim Tyan, Tuan Anh Nguyen, and Jae-Woo Lee. "iADA*-RL: Anytime Graph-Based Path Planning with Deep Reinforcement Learning for an Autonomous UAV." Applied Sciences 11, no. 9 (2021): 3948. http://dx.doi.org/10.3390/app11093948.

Abstract:
Path planning algorithms are of paramount importance in guidance and collision systems to provide trustworthiness and safety for operations of autonomous unmanned aerial vehicles (UAV). Previous works showed different approaches mostly focusing on shortest path discovery without a sufficient consideration on local planning and collision avoidance. In this paper, we propose a hybrid path planning algorithm that uses an anytime graph-based path planning algorithm for global planning and deep reinforcement learning for local planning which applied for a real-time mission planning system of an aut
42

Civetta, Joseph M., and Charles L. Fox. "Advantages of Resuscitation with Balanced Hypertonic Sodium Solution in Disasters." Prehospital and Disaster Medicine 1, S1 (1985): 179–80. http://dx.doi.org/10.1017/s1049023x0004437x.

Abstract:
Resuscitation in disasters must be effective, prompt, safe and uncomplicated. Clinical experience in severe, extensive thermal burns in numerous clinics has shown that balanced hypertonic sodium solution (BHSS) can achieve effective resuscitation with: administration of less volume of fluid; early onset of excretion of sodium-containing urine; less generalized edema and without pulmonary edema. This experience is now being transferred to patients after trauma and major surgical procedures often complicated by peritonitis. In an ongoing study of randomly selected adults following surgical traum
43

Wysocka, B. A., Z. Kassam, G. Lockwood, J. Brierley, L. Dawson, and J. Ringash. "Assessment of intra and interfractional organ motion during adjuvant radiochemotherapy in gastric cancer." Journal of Clinical Oncology 25, no. 18_suppl (2007): 15132. http://dx.doi.org/10.1200/jco.2007.25.18_suppl.15132.

Abstract:
Background: Adjuvant combined chemotherapy and radiotherapy (RT) in gastric cancer improves survival, however acute toxicity is substantial. Toxicity may be improved with three-dimensional (3D) RT, but organ motion must be considered in planning target volume (PTV) delineation. Methods: Participants (n=22) had baseline free breathing planning CT (CT0) with BodyFix immobilization. Abdominal CTs in free breathing (FB), inhale (I) and exhale (E) states were obtained in weeks 1, 3 and 5 of RT. Datasets were fused to CT0 in Pinnacle3 6.2 planning system using bone registration. Volumes of int
44

Niu, Tong, and Mohit Bansal. "AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (2020): 8560–67. http://dx.doi.org/10.1609/aaai.v34i05.6378.

Abstract:
Many sequence-to-sequence dialogue models tend to generate safe, uninformative responses. There have been various useful efforts on trying to eliminate them. However, these approaches either improve decoding algorithms during inference, rely on hand-crafted features, or employ complex models. In our work, we build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering. Specifically, we start with a simple yet effective automatic metric, AvgOut, which calculates the average output probability distribution of all time steps on the decoder
45

Vivek, Kumar, Shah Amiti, Saha Shivshankar, and Choudhary Lalit. "Electrolyte and Haemogram changes post large volume liposuction comparing two different tumescent solutions." Indian Journal of Plastic Surgery 47, no. 03 (2014): 386–93. http://dx.doi.org/10.4103/0970-0358.146604.

Abstract:
Background: The most common definitions of large volume liposuction refer to total 5 l volume aspiration during a single procedure (fat plus wetting solution). Profound haemodynamic and metabolic alterations can accompany large volume liposuction. Due to paucity of literature on the effect of different tumescent solutions on the electrolyte balance and haematological changes during large volume liposuction, we carried out this study using two different wetting solutions to study the same. Materials and Methods: Total 30 patients presenting with varying degrees of localized lipodystrop
46

Chebbi, Alif, Massimiliano Tazzari, Cristiana Rizzi, et al. "Burkholderia thailandensis E264 as a promising safe rhamnolipids’ producer towards a sustainable valorization of grape marcs and olive mill pomace." Applied Microbiology and Biotechnology 105, no. 9 (2021): 3825–42. http://dx.doi.org/10.1007/s00253-021-11292-0.

Abstract:
Within the circular economy framework, our study aims to assess the rhamnolipid production from winery and olive oil residues as low-cost carbon sources by nonpathogenic strains. After evaluating various agricultural residues from those two sectors, Burkholderia thailandensis E264 was found to use the raw soluble fraction of nonfermented (white) grape marcs (NF), as the sole carbon and energy source, and simultaneously, reducing the surface tension to around 35 mN/m. Interestingly, this strain showed a rhamnolipid production up to 1070 mg/L (13.37 mg/g of NF), with a higher purity, on
47

Brown, Jennifer R., Matthew S. Davids, Jordi Rodon, et al. "Update On The Safety and Efficacy Of The Pan Class I PI3K Inhibitor SAR245408 (XL147) In Chronic Lymphocytic Leukemia and Non-Hodgkin’s Lymphoma Patients." Blood 122, no. 21 (2013): 4170. http://dx.doi.org/10.1182/blood.v122.21.4170.4170.

Abstract:
Background: Constitutive activation of the phosphatidylinositol 3-kinase (PI3K)/mammalian target of rapamycin (mTOR) pathway by various mechanisms has been implicated in the pathogenesis of chronic lymphocytic leukemia (CLL) and non-Hodgkin's lymphoma (NHL). There is mounting evidence suggesting that along with PI3Kγ, PI3Kα may be involved in CLL/NHL. SAR245408 is a potent and selective inhibitor of all α, γ and δ class I PI3K isoforms. It has been shown to inhibit PI3K signaling and impact tumor growth in preclinical tumor models. The impact of SAR245408 on safety, tolerability, pharmaco
48

Tripathi, Malati, Ayushma Adhikari, and Bibhushan Neupane. "Misoprostol Versus Oxytocin for Induction of Labour at Term and Post Term Pregnancy of Primigravida." Journal of Universal College of Medical Sciences 6, no. 2 (2018): 56–59. http://dx.doi.org/10.3126/jucms.v6i2.22497.

Abstract:
Introduction: To compare effectiveness and safety of sublingually administered misoprostol and intravenously infused 10 units of oxytocin for labor induction at term and post term pregnant women in Gandaki Medical College Teaching Hospital (GMCTH).
 Materials and methods: This is a prospective study conducted in Department of Obstetrics and Gynaecology in Gandaki Medical College and performed on 120 patients of primigravida with cephalic presentation at term and post-term pregnancy. Patients were given 50µg sublingual misoprostol 6 hourly (two doses) and 5 units of oxytocin in 500ml RL st
49

Olupot-Olupot, Peter, Florence Aloroker, Ayub Mpoya, et al. "Gastroenteritis Rehydration Of children with Severe Acute Malnutrition (GASTROSAM): A Phase II Randomised Controlled trial: Trial Protocol." Wellcome Open Research 6 (June 23, 2021): 160. http://dx.doi.org/10.12688/wellcomeopenres.16885.1.

Abstract:
Background: Children hospitalised with severe acute malnutrition (SAM) are frequently complicated (>50%) by diarrhoea (≥3 watery stools/day) which is accompanied by poor outcomes. Rehydration guidelines for SAM are exceptionally conservative and controversial, based upon expert opinion. The guidelines only permit use of intravenous fluids for cases with advanced shock and exclusive use of low sodium intravenous and oral rehydration solutions (ORS) for fear of fluid and/or sodium overload. Children managed in accordance to these guidelines have a very high mortality. The proposed GASTROSAM t
50

Jiang, Jianhua, Yangang Ren, Yang Guan, et al. "Integrated decision and control at multi-lane intersections with mixed traffic flow." Journal of Physics: Conference Series 2234, no. 1 (2022): 012015. http://dx.doi.org/10.1088/1742-6596/2234/1/012015.

Abstract:
Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles and pedestrians. The driving policy should make safe decisions to handle the dynamic traffic conditions and meet the requirements of on-board computation. However, most current research focuses on simplified intersections considering only the surrounding vehicles and idealized traffic lights. This paper improves the integrated decision and control framework and develops a learning-based algorithm to deal