Journal articles on the topic "Safe RL"

Author: Grafiati

Published: September 6, 2023

Consult the top 50 journal articles for your research on the topic "Safe RL".

1

Carr, Steven, Nils Jansen, Sebastian Junges, and Ufuk Topcu. "Safe Reinforcement Learning via Shielding under Partial Observability." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (June 26, 2023): 14748–56. http://dx.doi.org/10.1609/aaai.v37i12.26723.

Abstract

Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment. A family of approaches to this problem assume domain knowledge in the form of a (partial) model of this environment to decide upon the safety of an action. A so-called shield forces the RL agent to select only safe actions. However, for adoption in various applications, one must look beyond enforcing safety and also ensure the applicability of RL with good performance. We extend the applicability of shields via tight integration with state-of-the-art deep RL, and provide an extensive, empirical study in challenging, sparse-reward environments under partial observability. We show that a carefully integrated shield ensures safety and can improve the convergence rate and final performance of RL agents. We furthermore show that a shield can be used to bootstrap state-of-the-art RL agents: they remain safe after initial learning in a shielded setting, allowing us to disable a potentially too conservative shield eventually.
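
The shielding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the corridor environment, the `shield` function, and the action names are hypothetical, and the sketch assumes the agent's position is fully observed, whereas the paper addresses partial observability.

```python
import random


def shield(position, cliff_cells, n_cells):
    """Hypothetical shield for a 1-D corridor: return the set of actions
    that keep the agent inside the corridor and off the cliff cells."""
    allowed = set()
    for action, step in (("left", -1), ("right", 1), ("stay", 0)):
        nxt = position + step
        if 0 <= nxt < n_cells and nxt not in cliff_cells:
            allowed.add(action)
    return allowed


def shielded_action(q_values, safe_actions, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection restricted to the shield's output."""
    if not safe_actions:
        raise ValueError("shield returned no safe action")
    candidates = sorted(safe_actions)  # sorted for reproducible tie-breaking
    if rng.random() < epsilon:
        return rng.choice(candidates)  # explore, but only among safe actions
    return max(candidates, key=lambda a: q_values[a])  # exploit among safe actions


# The greedy action "left" would step onto the cliff at cell 0,
# so the shield removes it and the agent falls back to "right".
q = {"left": 5.0, "right": 1.0, "stay": 0.0}
allowed = shield(position=1, cliff_cells={0}, n_cells=4)
choice = shielded_action(q, allowed, epsilon=0.0)
```

Because exploration is also restricted to the allowed set, the agent in this sketch never executes an action the shield rejects, which is the property the abstract describes.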

2

Ma, Yecheng Jason, Andrew Shen, Osbert Bastani, and Dinesh Jayaraman. "Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (June 28, 2022): 5404–12. http://dx.doi.org/10.1609/aaai.v36i5.20478.

Abstract

Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and the cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies that satisfy this conservative cost constraint are guaranteed to also be feasible in the true environment. We further show that this guarantees the safety of all intermediate solutions during RL training. Further, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach for model-based safe RL extensively on state and image-based environments. Our results demonstrate substantial gains in sample-efficiency while incurring fewer violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
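
The two ingredients named in this abstract, an uncertainty-based cost inflation and an adaptive penalty weight, can be sketched as follows. The function names, the linear form of the penalty, and the simple additive update rule are assumptions made for illustration; the authors' actual formulation is in the paper and the linked repository.

```python
def conservative_cost(predicted_cost, uncertainty, kappa):
    """Inflate the model's predicted cost by an uncertainty-based penalty.
    The linear form kappa * uncertainty is an assumption of this sketch."""
    return predicted_cost + kappa * uncertainty


def update_kappa(kappa, observed_cost, cost_budget, step_size=0.1):
    """Adapt the penalty weight from true cost feedback (hypothetical rule):
    raise kappa when the real environment cost exceeds the budget,
    lower it otherwise, and never let it drop below zero."""
    return max(0.0, kappa + step_size * (observed_cost - cost_budget))


# One adaptation step: the observed cost (3.0) exceeds the budget (1.0),
# so the penalty weight grows and later predictions become more conservative.
kappa = update_kappa(1.0, observed_cost=3.0, cost_budget=1.0, step_size=0.5)
cost = conservative_cost(predicted_cost=1.0, uncertainty=0.5, kappa=kappa)
```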

3

Xu, Haoran, Xianyuan Zhan, and Xiangyu Zhu. "Constraints Penalized Q-learning for Safe Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8753–60. http://dx.doi.org/10.1609/aaai.v36i8.20855.

Abstract

We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is more appealing for real-world RL applications, in which data collection is costly or dangerous. Enforcing constraint satisfaction is non-trivial, especially in offline settings, as there is a potentially large discrepancy between the policy distribution and the data distribution, causing errors in estimating the value of safety constraints. We show that naïve approaches that combine techniques from safe RL and offline RL can only learn sub-optimal solutions. We thus develop a simple yet effective algorithm, Constraints Penalized Q-Learning (CPQ), to solve the problem. Our method admits the use of data generated by mixed behavior policies. We present a theoretical analysis and demonstrate empirically that our approach can learn robustly across a variety of benchmark control tasks, outperforming several baselines.

4

Thananjeyan, Brijen, Ashwin Balakrishna, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, and Ken Goldberg. "Recovery RL: Safe Reinforcement Learning With Learned Recovery Zones." IEEE Robotics and Automation Letters 6, no. 3 (July 2021): 4915–22. http://dx.doi.org/10.1109/lra.2021.3070252.

5

Serrano-Cuevas, Jonathan, Eduardo F. Morales, and Pablo Hernández-Leal. "Safe reinforcement learning using risk mapping by similarity." Adaptive Behavior 28, no. 4 (July 18, 2019): 213–24. http://dx.doi.org/10.1177/1059712319859650.

Abstract

Reinforcement learning (RL) has been used to successfully solve sequential decision problems. However, considering risk at the same time as the learning process is an open research problem. In this work, we are interested in the type of risk that can lead to a catastrophic state. Related works that aim to deal with risk propose complex models. In contrast, we follow a simple, yet effective, idea: similar states might lead to similar risk. Using this idea, we propose risk mapping by similarity (RMS), an algorithm for discrete scenarios which infers the risk of newly discovered states by analyzing how similar they are to previously known risky states. In general terms, the RMS algorithm transfers the knowledge gathered by the agent regarding the risk to newly discovered states. We contribute a new approach to considering risk based on similarity, together with RMS, which is simple and generalizable as long as the premise "similar states yield similar risk" holds. RMS is not an RL algorithm, but a method to generate a risk-aware reward shaping signal that can be used with an RL algorithm to generate risk-aware policies.
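
The premise that similar states yield similar risk can be sketched in a few lines. The similarity measure (inverse of one plus Manhattan distance), the max-based transfer rule, and the shaping weight below are assumptions chosen for illustration, not the definitions used by RMS.

```python
def similarity(a, b):
    """Hypothetical similarity between two discrete states given as tuples:
    1.0 for identical states, decaying with Manhattan distance."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + distance)


def inferred_risk(state, known_risk):
    """Transfer risk to a newly discovered state from the most similar
    known risky state; known_risk maps state tuples to risk in [0, 1]."""
    if not known_risk:
        return 0.0
    return max(similarity(state, s) * r for s, r in known_risk.items())


def shaped_reward(reward, state, known_risk, weight=1.0):
    """Risk-aware reward shaping signal to hand to any RL algorithm."""
    return reward - weight * inferred_risk(state, known_risk)


# A cell adjacent to a known catastrophic cell inherits half of its risk.
known = {(0, 0): 1.0}
risk_next_to_hazard = inferred_risk((0, 1), known)
```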

6

Cheng, Richard, Gábor Orosz, Richard M. Murray, and Joel W. Burdick. "End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3387–95. http://dx.doi.org/10.1609/aaai.v33i01.33013387.

Abstract

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) online learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
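
How a control barrier function can constrain an RL action is easiest to see in one dimension. The sketch below assumes known single-integrator dynamics (the state changes at rate u) and a barrier h(x) = x_max - x; the paper instead learns the dynamics with Gaussian Processes and solves a quadratic program, so this closed-form filter is only an illustration of the constraint, and all names are hypothetical.

```python
def cbf_filter(u_rl, x, x_max, alpha=1.0):
    """Minimally modify the RL action so that the barrier condition
    dh/dt >= -alpha * h(x) holds for h(x) = x_max - x.

    With dx/dt = u we have dh/dt = -u, so the condition reduces to
    u <= alpha * (x_max - x). The closest safe action is a simple clip.
    """
    u_bound = alpha * (x_max - x)
    return min(u_rl, u_bound)


# Near the boundary the aggressive action 2.0 is cut back to 0.5,
# while an action that moves away from the boundary passes unchanged.
u_clipped = cbf_filter(2.0, x=0.5, x_max=1.0, alpha=1.0)
u_unchanged = cbf_filter(-1.0, x=0.5, x_max=1.0, alpha=1.0)
```

The filter leaves the RL policy untouched whenever its action already satisfies the barrier condition, which is why such a layer can be combined with any RL algorithm.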

7

Jurj, Sorin Liviu, Dominik Grundt, Tino Werner, Philipp Borchers, Karina Rothemann, and Eike Möhlmann. "Increasing the Safety of Adaptive Cruise Control Using Physics-Guided Reinforcement Learning." Energies 14, no. 22 (November 12, 2021): 7572. http://dx.doi.org/10.3390/en14227572.

Abstract

This paper presents a novel approach for improving the safety of vehicles equipped with Adaptive Cruise Control (ACC) by making use of Machine Learning (ML) and physical knowledge. More exactly, we train a Soft Actor-Critic (SAC) Reinforcement Learning (RL) algorithm that makes use of physical knowledge such as the jam-avoiding distance in order to automatically adjust the ideal longitudinal distance between the ego- and leading-vehicle, resulting in a safer solution. In our use case, the experimental results indicate that the physics-guided (PG) RL approach is better at avoiding collisions at any selected deceleration level and any fleet size when compared to a pure RL approach, proving that a physics-informed ML approach is more reliable when developing safe and efficient Artificial Intelligence (AI) components in autonomous vehicles (AVs).

8

Sakrihei, Helen. "Using automatic storage for ILL – experiences from the National Repository Library in Norway." Interlending & Document Supply 44, no. 1 (February 15, 2016): 14–16. http://dx.doi.org/10.1108/ilds-11-2015-0035.

Abstract

Purpose – The purpose of this paper is to share the Norwegian Repository Library (RL)’s experiences with an automatic storage for interlibrary lending (ILL). Design/methodology/approach – This paper describes how the RL uses the automatic storage to deliver ILL services to Norwegian libraries. Chaos storage is the main principle for storage. Findings – Using automatic storage for ILL is efficient, cost-effective and safe. Originality/value – The RL has used automatic storage since 2003, and it is one of a few libraries using this technology.

9

Ding, Yuhao, and Javad Lavaei. "Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 7396–404. http://dx.doi.org/10.1609/aaai.v37i6.25900.

Abstract

We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumulative variations do not exceed certain known variation budgets. Designing safe RL algorithms in time-varying environments is particularly challenging because of the need to integrate constraint violation reduction, safe exploration, and adaptation to the non-stationarity. To this end, we identify two alternative conditions on the time-varying constraints under which we can guarantee safety in the long run. We also propose the Periodically Restarted Optimistic Primal-Dual Proximal Policy Optimization (PROPD-PPO) algorithm, which can work with either condition. Furthermore, a dynamic regret bound and a constraint violation bound are established for the proposed algorithm in both the linear kernel CMDP function approximation setting and the tabular CMDP setting under the two alternative conditions. This paper provides the first provably efficient algorithm for non-stationary CMDPs with safe exploration.

10

Tubeuf, Carlotta, Felix Birkelbach, Anton Maly, and René Hofmann. "Increasing the Flexibility of Hydropower with Reinforcement Learning on a Digital Twin Platform." Energies 16, no. 4 (February 11, 2023): 1796. http://dx.doi.org/10.3390/en16041796.

Abstract

The increasing demand for flexibility in hydropower systems requires pumped storage power plants to change operating modes and compensate reactive power more frequently. In this work, we demonstrate the potential of applying reinforcement learning (RL) to control the blow-out process of a hydraulic machine during pump start-up and when operating in synchronous condenser mode. Even though RL is a promising method that is currently getting much attention, safety concerns are stalling research on RL for the control of energy systems. Therefore, we present a concept that enables process control with RL through the use of a digital twin platform. This enables the safe and effective transfer of the algorithm’s learning strategy from a virtual test environment to the physical asset. The successful implementation of RL in a test environment is presented and an outlook on future research on the transfer to a model test rig is given.

11

Yoon, Jae Ung, and Juhong Lee. "Uncertainty Sequence Modeling Approach for Safe and Effective Autonomous Driving." Korean Institute of Smart Media 11, no. 9 (October 31, 2022): 9–20. http://dx.doi.org/10.30693/smj.2022.11.9.9.

Abstract

Deep reinforcement learning (RL) is an end-to-end data-driven control method that is widely used in the autonomous driving domain. However, conventional RL approaches are difficult to apply to autonomous driving tasks due to problems such as inefficiency, instability, and uncertainty. These issues play an important role in the autonomous driving domain. Although recent studies have attempted to solve these problems, they are computationally expensive and rely on special assumptions. In this paper, we propose a new algorithm, MCDT, that addresses inefficiency, instability, and uncertainty by introducing a method called uncertainty sequence modeling to the autonomous driving domain. The sequence modeling method, which views reinforcement learning as a decision-making generation problem aimed at obtaining high rewards, avoids the disadvantages of existing studies, guarantees efficiency and stability, and also considers safety by integrating uncertainty estimation techniques. The proposed method was tested in the OpenAI Gym CarRacing environment, and the experimental results show that the MCDT algorithm provides efficient, stable, and safe performance compared to the existing reinforcement learning method.

12

Lin, Xingbin, Deyu Yuan, and Xifei Li. "Reinforcement Learning with Dual Safety Policies for Energy Savings in Building Energy Systems." Buildings 13, no. 3 (February 21, 2023): 580. http://dx.doi.org/10.3390/buildings13030580.

Abstract

Reinforcement learning (RL) is being gradually applied in the control of heating, ventilation and air-conditioning (HVAC) systems to learn the optimal control sequences for energy savings. However, due to the “trial and error” issue, the output sequences of RL may cause operational safety issues when RL is applied in real systems. To solve those problems, an RL algorithm with dual safety policies for energy savings in HVAC systems is proposed. In the proposed dual safety policies, the implicit safety policy is a part of the RL model, which integrates safety into the optimization target of RL by adding penalties to the reward for actions that exceed the safety constraints. In the explicit safety policy, an online safety classifier is built to filter the actions output by RL; thus, only those actions that are classified as safe and have the highest benefits will be finally selected. In this way, the safety of controlled HVAC systems running with the proposed RL algorithm can be effectively satisfied while reducing energy consumption. To verify the proposed algorithm, we implemented the control algorithm in an existing commercial building. After a certain period of self-learning, the energy consumption of the HVAC system was reduced by more than 15.02% compared to proportional–integral–derivative (PID) control. Meanwhile, compared to the independent application of the RL algorithm without a safety policy, the proportion of time in which the indoor temperature did not meet the demand was reduced by 25.06%.
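
The two safety policies described in this abstract can be sketched side by side. Everything below is illustrative: the temperature band, the penalty weight, the candidate setpoints, and the rule-based `is_safe` check standing in for the paper's online safety classifier are all assumptions.

```python
def penalized_reward(energy_saving, temperature, low, high, penalty=10.0):
    """Implicit safety policy: subtract a penalty proportional to how far
    the indoor temperature falls outside the comfort band [low, high]."""
    violation = max(low - temperature, 0.0) + max(temperature - high, 0.0)
    return energy_saving - penalty * violation


def select_action(candidates, is_safe, fallback):
    """Explicit safety policy: keep only the candidate actions the safety
    check accepts, then pick the one with the highest estimated value.

    candidates is a list of (action, value) pairs; fallback is returned
    when no candidate passes the check."""
    safe = [(action, value) for action, value in candidates if is_safe(action)]
    if not safe:
        return fallback
    return max(safe, key=lambda pair: pair[1])[0]


# The highest-valued setpoint (18) is rejected as unsafe,
# so the best remaining safe setpoint (22) is selected.
setpoints = [(18, 0.9), (22, 0.7), (24, 0.4)]
chosen = select_action(setpoints, lambda sp: 20 <= sp <= 26, fallback=23)
```

Combining both mechanisms means the agent is discouraged from unsafe actions during learning and also prevented from executing them at run time.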

13

Marchesini, Enrico, Davide Corsi, and Alessandro Farinelli. "Exploring Safer Behaviors for Deep Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7701–9. http://dx.doi.org/10.1609/aaai.v36i7.20737.

Abstract

We consider Reinforcement Learning (RL) problems where an agent attempts to maximize a reward signal while minimizing a cost function that models unsafe behaviors. Such formalization is addressed in the literature using constrained optimization on the cost, limiting the exploration and leading to a significant trade-off between cost and reward. In contrast, we propose a Safety-Oriented Search that complements Deep RL algorithms to bias the policy toward safety within an evolutionary cost optimization. We leverage evolutionary exploration benefits to design a novel concept of safe mutations that use visited unsafe states to explore safer actions. We further characterize the behaviors of the policies over desired specifics with a sample-based bound estimation, which makes prior verification analysis tractable in the training loop, hence driving the learning process towards safer regions of the policy space. Empirical evidence on the Safety Gym benchmark shows that we successfully avoid drawbacks on the return while improving the safety of the policy.

14

Egleston, David, Patricia Ann Castelli, and Thomas George Marx. "Developing, validating, and testing a model of reflective leadership." Leadership & Organization Development Journal 38, no. 7 (September 4, 2017): 886–96. http://dx.doi.org/10.1108/lodj-09-2016-0230.

Abstract

Purpose: The purpose of this paper is to develop, validate, and test the impacts of reflective leadership (RL) on organizational performance. Design/methodology/approach: This is an empirical study based on over 700 survey responses from business leaders around the world. An instrument was developed to validate the model, and the statistical significance of its impacts on organizational performance was tested. Findings: The findings show that a model of RL consisting of three leadership practices (creating an open and safe work environment, defining purpose, and challenging assumptions) had significant impacts on organizational performance, accounting for 16.5 percent of the variance in the accomplishment of organizational goals, 13.9 percent of the variance in sales, and 14.7 percent of the variance in profits. Research limitations/implications: The major limitations are the biases that might be introduced with survey data. There are numerous implications for future research in terms of exploring additional RL practices, their impacts on additional and objective measures of performance, and the effects of moderating and mediating variables on the impacts of RL on performance. Practical implications: The results show that RL is an effective management tool practitioners can employ to improve organizational performance. Originality/value: A number of studies have broadly suggested that RL improves organizational performance, but this study empirically tests the impacts of a clearly defined, validated model of leadership on specific measures of performance.

15

Huh, Gene, and Wonjae Cha. "Development and Clinical Application of Real-Time Light-Guided Vocal Fold Injection." Journal of The Korean Society of Laryngology, Phoniatrics and Logopedics 33, no. 1 (April 30, 2022): 1–6. http://dx.doi.org/10.22469/jkslp.2022.33.1.1.

Abstract

Vocal fold injection (VFI) is widely accepted as a first-line treatment for unilateral vocal fold paralysis and other vocal fold diseases. Although VFI is advantageous for its minimal invasiveness and efficiency, the invisibility of the needle tip remains an essential handicap in precise localization. Real-time light-guided vocal fold injection (RL-VFI) is a novel technique that was developed under the concept of performing simultaneous injection with precise placement of the needle tip under light guidance. Ex vivo animal studies have confirmed that RL-VFI can be technically implemented and that injecting the needle from various directions is feasible. A further in vivo animal study has confirmed the safety and feasibility of the procedure when various transcutaneous approaches were applied. Currently, the RL-VFI device is authorized for clinical use by the Ministry of Food and Drug Safety in South Korea and is clinically applied to patients with safe and favorable outcomes. Several clinical studies are currently under way to confirm the safety and the efficiency of RL-VFI. RL-VFI is expected to improve the complication rate and the functional outcome of voice. Furthermore, it will support laryngologists in overcoming the steep learning curve through its intuitive guidance.

16

Ramakrishnan, Ramya, Ece Kamar, Debadeepta Dey, Eric Horvitz, and Julie Shah. "Blind Spot Detection for Safe Sim-to-Real Transfer." Journal of Artificial Intelligence Research 67 (February 4, 2020): 191–234. http://dx.doi.org/10.1613/jair.1.11436.

Abstract

Agents trained in simulation may make errors when performing actions in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult for the agent to discover because the agent is unable to predict them a priori. In this work, we propose the use of oracle feedback to learn a predictive model of these blind spots in order to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: when the agent lacks necessary features to represent the true state of the world, and thus cannot distinguish between numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. Our system learns models for predicting blind spots within unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. These models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach across two domains and demonstrate that it achieves higher predictive performance than baseline methods, and also that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how these biases influence the discovery of blind spots. Further, we include analyses of our approach that incorporate relaxed initial optimality assumptions. (Interestingly, relaxing the assumptions of an optimal oracle and an optimal simulator policy helped our models to perform better.) We also propose extensions to our method that are intended to improve performance when using corrections and demonstrations data.

17

Hao, Hao, Yichen Sun, Xueyun Mei, and Yanjun Zhou. "Reverse Logistics Network Design of Electric Vehicle Batteries considering Recall Risk." Mathematical Problems in Engineering 2021 (August 18, 2021): 1–16. http://dx.doi.org/10.1155/2021/5518049.

Abstract

In 2018-2019, the recall scale of electric vehicles (EVs) in China reached 168,700 units; recalls account for approximately 6.9% of sales volume. There are imperative reasons for electric vehicle batteries (EVBs) recalls, such as mandatory laws or policies, safety and environmental pollution risks, and the high value of EVB echelon use, and thus, it has become increasingly important to reasonably design a reverse logistics (RL) network for an EVB recall. In this study, a multiobjective and multiperiod recall RL network model is developed to minimize safety and environmental risks, maximize the social responsibility and economic benefits, and consider the characteristics of EVBs, including the configuration of key recall facilities and the control of recall flows. The results of this study will help EVB practitioners, relevant departmental policymakers, and others to comprehensively understand the recall of EVBs, strengthen the safety and environmental protection issues in the EVB recall process, and promote the establishment of a safe, green, and sustainable EVB recall RL network.

18

Ray, Kaustabha, and Ansuman Banerjee. "Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning." ACM Transactions on Embedded Computing Systems 20, no. 6 (November 30, 2021): 1–33. http://dx.doi.org/10.1145/3475991.

Abstract

Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low latency access to services deployed on edge servers to avert network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy which is used to decide the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP). We demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service specific latency requirements. We prove that our reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large scale scenarios.

19

Delgado, Tomás, Marco Sánchez Sorondo, Víctor Braberman, and Sebastián Uchitel. "Exploration Policies for On-the-Fly Controller Synthesis: A Reinforcement Learning Approach." Proceedings of the International Conference on Automated Planning and Scheduling 33, no. 1 (July 1, 2023): 569–77. http://dx.doi.org/10.1609/icaps.v33i1.27238.

Abstract

Controller synthesis is in essence a case of model-based planning for non-deterministic environments in which plans (actually “strategies”) are meant to preserve system goals indefinitely. In the case of supervisory control, environments are specified as the parallel composition of state machines, and valid strategies are required to be “non-blocking” (i.e., always enabling the environment to reach certain marked states) in addition to safe (i.e., keeping the system within a safe zone). Recently, On-the-fly Directed Controller Synthesis techniques were proposed to avoid the exploration of the entire (and exponentially large) environment space, at the cost of non-maximal permissiveness, to either find a strategy or conclude that there is none. The incremental exploration of the plant is currently guided by a domain-independent human-designed heuristic. In this work, we propose a new method for obtaining heuristics based on Reinforcement Learning (RL). The synthesis algorithm is thus framed as an RL task with an unbounded action space, and a modified version of DQN is used. With a simple and general set of features that abstracts both states and actions, we show that it is possible to learn heuristics on small versions of a problem that generalize to the larger instances, effectively doing zero-shot policy transfer. Our agents learn from scratch in a highly partially observable RL task and outperform the existing heuristic overall, in instances unseen during training.

20

Bolster, Lauren, Mark Bosch, Brian Brownbridge, and Anurag Saxena. "RAP Trial: Ringer's Lactate and Packed Red Blood Cell Transfusion, An in Vitro Study and Chart Review." Blood 114, no. 22 (November 20, 2009): 2105. http://dx.doi.org/10.1182/blood.v114.22.2105.2105.

Abstract

Background: The Canadian Blood Services and the American Association of Blood Banks state that intravenous solution administered with packed red blood cells (pRBC) must be isotonic and must not contain calcium or glucose. This recommendation is based on in vitro investigations demonstrating that calcium-containing solutions can initiate in vitro coagulation in citrated blood (Ryden 1975, Dickson 1980, Lorenzo 1998). Recently this recommendation has been challenged by in vitro studies that combined AS-3 pRBC with Ringer's Lactate (RL) (Albert 2009). Currently there are anaesthetists that use RL with pRBC for intra-operative transfusions. The purpose of this study was to evaluate whether RL, as compared with normal saline (NS), can safely be used in transfusion therapy. Methods: Eleven units of AS-3 pRBC were obtained for this study. In part A, multiple dilutions of blood with RL or NS were assessed for clot, both visually using a 20 micron filter and molecularly using F1+2 ELISA. In part B, blood was run through a standard gravity filter with NS or RL, to simulate both ward and intra-operative transfusion practice. The blood was then assessed for clot at times 30, 60, 120, and 240 minutes. In part C, patients who received intra-operative transfusions of pRBC with RL were identified. These charts were reviewed for evidence of transfusion reactions including: TRALI, arterial or venous thrombosis, coagulopathy and mortality. Results: In part A, none of the filters had evidence of visible clot, nor evidence of thrombin generation at supraphysiologic levels, >300 pmol/L (Ota 2008, Pelzer 1991). In part B, there was no visible evidence of clot at the preselected time points. Over four hours the NS + blood group F1+2 levels ranged from 12-267 pmol/L, and the RL + blood F1+2 levels ranged from 14-218 pmol/L. In the transfusion set primed with blood and then RL added, a simulation of common operating room transfusion practice, the F1+2 ELISA levels ranged from 13-435 pmol/L. There was no statistically significant difference in the ELISA F1+2 levels between the NS and RL groups (NS + blood vs. RL + blood p = 0.547, NS + blood vs. blood + RL p = 0.663). Nine patients totaling 36 units of pRBC transfused with RL were reviewed. Transfusion times ranged from 15-95 minutes, with an average transfusion time of 30 minutes per unit of pRBC. There was no evidence of transfusion-related adverse events identified. Discussion: In addition to our results confirming recent studies that demonstrate in vitro compatibility of pRBC with RL, our study is the first comprehensive study involving visual clot, molecular clot and intra-operative transfusion of pRBC with RL. We have utilized a 20 micron filter, whereas previous studies utilized 40 micron filters. Further, in contrast to other studies we are the first to investigate the effects of both the concentration of RL with blood and the transfusion duration; no other study has looked at four hour transfusions with RL. Conclusion: Our study adds credibility to the hypothesis that RL is safe for clinical transfusion, including intra-operative and ward transfusion practice. Although new evidence now challenges international transfusion guidelines, a larger study should be conducted before transfusion with RL is adopted into widespread clinical practice. Disclosures: No relevant conflicts of interest to declare.

21

Romey, Aurore, Hussaini G. Ularamu, Abdulnaci Bulut, Syed M. Jamal, Salman Khan, Muhammad Ishaq, Michael Eschbaumer, et al. "Field Evaluation of a Safe, Easy, and Low-Cost Protocol for Shipment of Samples from Suspected Cases of Foot-and-Mouth Disease to Diagnostic Laboratories." Transboundary and Emerging Diseases 2023 (August 5, 2023): 1–15. http://dx.doi.org/10.1155/2023/9555213.

Abstract

Identification and characterization of the foot-and-mouth disease virus (FMDV) strains circulating in endemic countries and their dynamics are essential elements of the global FMD control strategy. Characterization of FMDV is usually performed in reference laboratories (RL). However, shipping of FMD samples to RL is a challenge due to the cost and biosafety requirements of transportation, resulting in a lack of knowledge about the strains circulating in some endemic areas. In order to simplify this step and to encourage sample submission to RL, we have previously developed a low-cost protocol for the shipment of FMD samples based on the use of lateral flow devices (LFDs) combined with a simple virus inactivation step using 0.2% citric acid. The present study aimed to evaluate this inactivation protocol in the field. For this purpose, 60 suspected FMD clinical samples were collected in Nigeria, Pakistan, and Turkey, three countries where FMD is endemic. Sample treatment, testing on LFDs, and virus inactivation steps were performed in the field when possible. The effectiveness of the virus inactivation was confirmed at the RL. After RNA extraction from the 60 inactivated LFDs, all were confirmed as FMDV positive by real-time reverse transcription polymerase chain reaction (RT-PCR). The serotype was identified by conventional RT-PCR for 86% of the samples. The topotype and/or lineage was successfully determined for 60% of the samples by Sanger sequencing and sequence analyses. After chemical transfection of RNA extracted from inactivated LFDs, into permissive cells, infectious virus was rescued from 15% of the samples. Implementation of this user-friendly protocol can substantially reduce shipping costs, which should increase the submission of field samples and therefore improve knowledge of the circulating FMDV strains.

22

Dai, Juntao, Jiaming Ji, Long Yang, Qian Zheng, and Gang Pan. "Augmented Proximal Policy Optimization for Safe Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 7288–95. http://dx.doi.org/10.1609/aaai.v37i6.25888.

Abstract

Safe reinforcement learning considers practical scenarios that maximize the return while satisfying safety constraints. Current algorithms, which suffer from training oscillations or approximation errors, still struggle to update the policy efficiently with precise constraint satisfaction. In this article, we propose Augmented Proximal Policy Optimization (APPO), which augments the Lagrangian function of the primal constrained problem via attaching a quadratic deviation term. The constructed multiplier-penalty function dampens cost oscillation for stable convergence while being equivalent to the primal constrained problem to precisely control safety costs. APPO alternately updates the policy and the Lagrangian multiplier via solving the constructed augmented primal-dual problem, which can be easily implemented by any first-order optimizer. We apply our APPO methods in diverse safety-constrained tasks, setting a new state of the art compared with a comprehensive list of safe RL baselines. Extensive experiments verify the merits of our method in easy implementation, stable convergence, and precise cost control.
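To make the idea of a multiplier-penalty function concrete, here is a minimal sketch of a generic augmented Lagrangian loop on a one-dimensional toy problem. It is not the APPO algorithm: the objective, the constraint, and every name and hyperparameter (`sigma`, `lr`, the iteration counts) are assumptions chosen for illustration. It only shows the alternating pattern the abstract describes, a first-order primal update followed by a multiplier update, with a quadratic term added to the usual Lagrangian.

```python
def augmented_lagrangian_toy(sigma=1.0, lr=0.1, outer_iters=50, inner_iters=200):
    """Minimise f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0 (hypothetical toy problem)."""
    x, lam = 0.0, 0.0
    for _ in range(outer_iters):
        # Primal step: plain gradient descent on the augmented Lagrangian
        #   f(x) + (max(0, lam + sigma * g(x))**2 - lam**2) / (2 * sigma)
        for _ in range(inner_iters):
            active = max(0.0, lam + sigma * (x - 1.0))
            grad = 2.0 * (x - 2.0) + active  # uses g'(x) = 1
            x -= lr * grad
        # Dual step: multiplier update, kept non-negative
        lam = max(0.0, lam + sigma * (x - 1.0))
    return x, lam


x_star, lam_star = augmented_lagrangian_toy()
```

On this toy problem the iterates settle near the constrained optimum x = 1 with multiplier 2; in a policy-optimization setting the scalar `x` would be replaced by policy parameters and `g` by an estimated safety cost minus its budget.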

23

Krstić, Mladen, Giulio Paolo Agnusdei, Pier Paolo Miglietta, Snežana Tadić, and Violeta Roso. "Applicability of Industry 4.0 Technologies in the Reverse Logistics: A Circular Economy Approach Based on COmprehensive Distance Based RAnking (COBRA) Method." Sustainability 14, no. 9 (May 7, 2022): 5632. http://dx.doi.org/10.3390/su14095632.

Abstract

The logistics sector plays one of the most important roles in the supply chain with the aim of providing a fast, flexible, safe, economical, efficient, and environmentally acceptable performance of freight transport flows. In addition, the popularization of the concept of a circular economy (CE), used to retain goods, components, and materials at their highest usability and value at all times, illustrates the importance of the adequate performance of reverse logistics (RL) processes. However, traditional RL is unable to cope with the requirements of modern supply chains and requires the application of Industry 4.0 technologies, which would make it more efficient. The main aim of this study was to evaluate the applicability of various Industry 4.0 technologies in the RL sector in order to point out the most applicable ones. To solve the defined problem, a novel multi-criteria decision making (MCDM) model was defined by combining the best-worst method (BWM) to obtain the criteria weights, and the newly developed comprehensive distance-based ranking (COBRA) method to rank the technologies. Another aim of the study was to validate the newly established method. The results indicated that the most applicable technologies were the Internet of Things, cloud computing, and electronic-mobile marketplaces. These technologies will have a significant impact on the development of RL and the establishment of CE systems, thus bringing about all the related positive effects.

24

Prasetyo, Risky Vitria, Abdul Latief Azis, and Soegeng Soegijanto. "Comparison of the efficacy and safety of hydroxyethyl starch 130/0.4 and Ringer's lactate in children with grade III dengue hemorrhagic fever." Paediatrica Indonesiana 49, no. 2 (April 30, 2009): 97. http://dx.doi.org/10.14238/pi49.2.2009.97-103.

Abstract

Background: Theoretically, hydroxyethyl starch (HES) will give more rapid recovery from shock, including in dengue shock syndrome (DSS), and is currently gaining popularity for its less deleterious effects on renal function and blood coagulation. Objectives: To compare the efficacy and safety of HES 130/0.4 and Ringer's lactate (RL) for shock recovery in children with DSS. Methods: A randomized controlled study was performed on 39 children admitted with DSS at Dr. Soetomo Hospital, Surabaya, between March and May 2007. Children were grouped into grade III (n=25) and grade IV (n=14) dengue hemorrhagic fever (DHF) according to the WHO criteria. Within each group, subjects were randomly assigned to receive initial fluid resuscitation with either HES 130/0.4 (n=9 in the DHF grade III group, 10 in the DHF grade IV) or RL (n=16 in the DHF grade III group, 4 in the DHF grade IV). Clinical and laboratory data were collected to determine improvements in shock recovery and adverse reactions. Results: In both grades III and IV DHF, HES 130/0.4 significantly decreased hemoglobin and hematocrit levels. Clinical improvements in pulse pressure and pulse rate were seen after treatment with HES 130/0.4, although these were statistically insignificant compared to the RL group. No differences in fluid requirement and recurrent shock episodes were noted between the RL and HES groups. No adverse reactions were found during the study. Conclusion: HES 130/0.4 administration is effective and safe in children with DSS.

25

Böck, Markus, Julien Malle, Daniel Pasterk, Hrvoje Kukina, Ramin Hasani, and Clemens Heitzinger. "Superhuman performance on sepsis MIMIC-III data by distributional reinforcement learning." PLOS ONE 17, no. 11 (November 3, 2022): e0275358. http://dx.doi.org/10.1371/journal.pone.0275358.

Abstract

We present a novel setup for treating sepsis using distributional reinforcement learning (RL). Sepsis is a life-threatening medical emergency. Its treatment is considered to be a challenging high-stakes decision-making problem, which has to procedurally account for risk. Treating sepsis with machine learning algorithms is difficult for a couple of reasons: there is limited and error-afflicted initial data in a highly complex biological system, combined with the need to make robust, transparent and safe decisions. We demonstrate a suitable method that combines data imputation by a kNN model using a custom distance with state representation by discretization using clustering, and that enables superhuman decision-making using speedy Q-learning in the framework of distributional RL. Compared to clinicians, the recovery rate is increased by more than 3% on the test data set. Our results illustrate how risk-aware RL agents can play a decisive role in critical situations such as the treatment of sepsis patients, a situation exacerbated by the COVID-19 pandemic (Martineau 2020). In addition, we emphasize the tractability of the methodology and the learning behavior while addressing some criticisms of the previous work (Komorowski et al. 2018) on this topic.

26

Li, Yue, Xiao Yong Bai, Shi Jie Wang, Luo Yi Qin, Yi Chao Tian, and Guang Jie Luo. "Evaluating of the spatial heterogeneity of soil loss tolerance and its effects on erosion risk in the carbonate areas of southern China." Solid Earth 8, no. 3 (May 29, 2017): 661–69. http://dx.doi.org/10.5194/se-8-661-2017.

Abstract

Soil loss tolerance (T value) is one of the criteria in determining the necessity of erosion control measures and ecological restoration strategy. However, the validity of this criterion in subtropical karst regions is strongly disputed. In this study, T value is calculated based on soil formation rate by using a digital distribution map of carbonate rock assemblage types. Results indicated a spatial heterogeneity and diversity in soil loss tolerance. Instead of only one criterion, a minimum of three criteria should be considered when investigating the carbonate areas of southern China because the "one region, one T value" concept may not be applicable to this region. T value is proportionate to the amount of argillaceous material, which determines the surface soil thickness of the formations in homogenous carbonate rock areas. Homogenous carbonate rock areas, carbonate rock intercalated with clastic rock areas, and carbonate/clastic rock alternation areas have T values of 20, 50 and 100 t/(km² a), and they are extremely, severely and moderately sensitive to soil erosion, respectively. Karst rocky desertification (KRD) is defined as extreme soil erosion and reflects the risks of erosion. Thus, the relationship between T value and erosion risk is determined using KRD as a parameter. The existence of KRD land is unrelated to the T value, although this parameter indicates erosion sensitivity. Erosion risk is strongly dependent on the relationship between real soil loss (RL) and T value rather than on either erosion intensity or the T value itself. If RL ≫ T, then the erosion risk is high despite a low RL. Conversely, if T ≫ RL, then the soil is safe although RL is high. Overall, these findings may clarify the heterogeneity of T value and its effect on erosion risk in a karst environment.

27

Kondrup, Flemming, Thomas Jiralerspong, Elaine Lau, Nathan De Lara, Jacob Shkrob, My Duc Tran, Doina Precup, and Sumana Basu. "Towards Safe Mechanical Ventilation Treatment Using Deep Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 15696–702. http://dx.doi.org/10.1609/aaai.v37i13.26862.

Abstract

Mechanical ventilation is a key form of life support for patients with pulmonary impairment. Healthcare workers are required to continuously adjust ventilator settings for each patient, a challenging and time-consuming task. Hence, it would be beneficial to develop an automated decision support tool to optimize ventilation treatment. We present DeepVent, a Conservative Q-Learning (CQL) based offline Deep Reinforcement Learning (DRL) agent that learns to predict the optimal ventilator parameters for a patient to promote 90-day survival. We design a clinically relevant intermediate reward that encourages continuous improvement of the patient vitals and addresses the challenge of sparse reward in RL. We find that DeepVent recommends ventilation parameters within safe ranges, as outlined in recent clinical trials. The CQL algorithm offers additional safety by mitigating the overestimation of the value estimates of out-of-distribution states/actions. We evaluate our agent using Fitted Q Evaluation (FQE) and demonstrate that it outperforms physicians from the MIMIC-III dataset.
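The way CQL discourages overestimation for out-of-distribution actions can be illustrated with its commonly used regularizer for discrete actions: a log-sum-exp over all action values minus the value of the action actually seen in the data. The sketch below assumes a small tabular batch and is not taken from DeepVent; the function name, the `alpha` weight, and the toy inputs are illustrative assumptions.

```python
import numpy as np


def cql_regularizer(q_values, data_actions, alpha=1.0):
    """Conservative penalty: alpha * mean(logsumexp_a Q(s, a) - Q(s, a_data))."""
    q_values = np.asarray(q_values, dtype=float)      # shape (batch, n_actions)
    data_actions = np.asarray(data_actions, dtype=int)  # shape (batch,)
    # Numerically stable log-sum-exp over the action axis
    q_max = q_values.max(axis=1, keepdims=True)
    lse = q_max[:, 0] + np.log(np.exp(q_values - q_max).sum(axis=1))
    # Q-value of the action that actually appears in the offline dataset
    q_data = q_values[np.arange(len(data_actions)), data_actions]
    return alpha * float(np.mean(lse - q_data))


# Two actions with equal value: the penalty equals log(2)
penalty = cql_regularizer([[0.0, 0.0]], [0])
```

In training, this term would be added to the usual temporal-difference loss, so that raising the value of actions absent from the dataset increases the loss.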

28

Miyajima, Hirofumi, Noritaka Shigei, Syunki Makino, Hiromi Miyajima, Yohtaro Miyanishi, Shinji Kitagami, and Norio Shiratori. "A proposal of privacy preserving reinforcement learning for secure multiparty computation." Artificial Intelligence Research 6, no. 2 (May 23, 2017): 57. http://dx.doi.org/10.5430/air.v6n2p57.

Abstract

Many studies have addressed the security of cloud computing. Data encryption is a typical approach, but encrypting and decrypting data has a high computational cost. Safe systems for distributed processing of secure data have therefore attracted attention and been studied extensively. Secure multiparty computation (SMC) is one such method. Specifically, two learning methods for machine learning (ML) with SMC are known. One is to divide the learning data into several subsets and perform learning. The other is to divide each item of learning data and perform learning. So far, most work on ML with SMC has dealt with supervised and unsupervised learning, such as BP and K-means methods. There do not appear to be any studies of reinforcement learning (RL) with SMC. This paper proposes learning methods with SMC for Q-learning, which is one of the typical methods for RL. The effectiveness of the proposed methods is shown by numerical simulation for the maze problem.

29

Thananjeyan, Brijen, Ashwin Balakrishna, Ugo Rosolia, Felix Li, Rowan McAllister, Joseph E. Gonzalez, Sergey Levine, Francesco Borrelli, and Ken Goldberg. "Safety Augmented Value Estimation From Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks." IEEE Robotics and Automation Letters 5, no. 2 (April 2020): 3612–19. http://dx.doi.org/10.1109/lra.2020.2976272.

30

Ren, Tianzhu, Yuanchang Xie, and Liming Jiang. "Cooperative Highway Work Zone Merge Control Based on Reinforcement Learning in a Connected and Automated Environment." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 10 (July 17, 2020): 363–74. http://dx.doi.org/10.1177/0361198120935873.

Abstract

Given the aging infrastructure and the anticipated growing number of highway work zones in the U.S.A., it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected, and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicle in the closed lane learns how to adjust its longitudinal position optimally to find a safe gap in the open lane using an off-policy soft actor critic reinforcement learning (RL) algorithm, considering its surrounding traffic conditions. The learning results are captured in convolutional neural networks and used to control individual vehicles in the testing phase. By adding the metering zones and taking the locations, speeds, and accelerations of surrounding vehicles into account, cooperation among vehicles is implicitly considered. This RL-based model is trained and evaluated using a microscopic traffic simulator. The results show that this cooperative RL-based merge control significantly outperforms popular strategies such as late merge and early merge in terms of both mobility and safety measures. It also performs better than a strategy assuming all vehicles are equipped with cooperative adaptive cruise control.

31

Reda, Ahmad, and József Vásárhelyi. "Design and Implementation of Reinforcement Learning for Automated Driving Compared to Classical MPC Control." Designs 7, no. 1 (January 29, 2023): 18. http://dx.doi.org/10.3390/designs7010018.

Abstract

Many classic control approaches have already proved their merits in the automotive industry. Model predictive control (MPC) is one of the most commonly used methods. However, its efficiency drops off with increase in complexity of the driving environment. Recently, machine learning methods have been considered an efficient alternative to classical control approaches. Even with successful implementation of reinforcement learning in real-world applications, it is still not commonly used compared to supervised and unsupervised learning. In this paper, a reinforcement learning (RL)-based framework is suggested for application in autonomous driving systems to maintain a safe distance. Additionally, an MPC-based control model is designed for the same task. The behavior of the two controllers is compared and discussed. The trained RL model was deployed on a low-end FPGA-in-the-loop (field-programmable gate array in-the-loop). The results showed that the two controllers responded efficiently to changes in the environment. Specifically, the response of the RL controller was faster, at approximately 1.75 s, than that of the MPC controller, while the MPC provided better overshooting performance (approximately 1.3 m/s less) in terms of following the reference speeds. The reinforcement-learning model showed efficient behavior after being deployed on the FPGA, with 4.9×10⁻⁶ m²/s as a maximum deviation compared to MATLAB Simulink.

32

Gardille, Arnaud, and Ola Ahmad. "Towards Safe Reinforcement Learning via OOD Dynamics Detection in Autonomous Driving System (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 16216–17. http://dx.doi.org/10.1609/aaai.v37i13.26968.

Abstract

Deep reinforcement learning (DRL) has proven effective in training agents to achieve goals in complex environments. However, a trained RL agent may exhibit, during deployment, unexpected behavior when faced with a situation where its state transitions differ even slightly from the training environment. Such a situation can arise for a variety of reasons. Rapid and accurate detection of anomalous behavior appears to be a prerequisite for using DRL in safety-critical systems, such as autonomous driving. We propose a novel OOD detection algorithm based on modeling the transition function of the training environment. Our method captures the bias of model behavior when encountering subtle changes of dynamics while maintaining a low false positive rate. Preliminary evaluations on the realistic simulator CARLA corroborate the relevance of our proposed method.

33

Free, David. "In the News." College & Research Libraries News 80, no. 10 (November 5, 2019): 541. http://dx.doi.org/10.5860/crln.80.10.541.

Abstract

Welcome to the November 2019 issue of C&RL News. Many academic libraries have begun focusing efforts on addressing the mental health and well being of their populations. Marshall University in West Virginia, one of the states hit hardest by the recent opioid crises, focused on their libraries as mental health safe spaces. Sabrina Thomas and Kacy Lovelace discuss their collaborative campus project in “Combining efforts.” Learn more about resources available for “Mental health awareness” in this month’s Internet Resources article by Emily Underwood.

34

Xu, Xibao, Yushen Chen, and Chengchao Bai. "Deep Reinforcement Learning-Based Accurate Control of Planetary Soft Landing." Sensors 21, no. 23 (December 6, 2021): 8161. http://dx.doi.org/10.3390/s21238161.

Abstract

Planetary soft landing has been studied extensively due to its promising application prospects. In this paper, a soft landing control algorithm based on deep reinforcement learning (DRL) with good convergence properties is proposed. First, the soft landing problem of the powered descent phase is formulated and the theoretical basis of the reinforcement learning (RL) used in this paper is introduced. Second, to ease convergence, a reward function is designed to include process rewards such as a velocity tracking reward, which addresses the problem of sparse reward. By also including a fuel consumption penalty and a constraint violation penalty, the lander can learn to achieve the velocity tracking goal while saving fuel and keeping the attitude angle within safe ranges. Then, training simulations are carried out under the frameworks of deep deterministic policy gradient (DDPG), twin delayed DDPG (TD3), and soft actor-critic (SAC), which are classical RL frameworks, and all converged. Finally, the trained policy is deployed in velocity tracking and soft landing experiments, the results of which demonstrate the validity of the proposed algorithm.
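A reward of the kind described, combining a dense tracking term with fuel and constraint penalties, might be composed as in the sketch below. The abstract does not give the actual functional form, so the terms, weights, and the attitude limit here are hypothetical placeholders rather than the authors' design.

```python
def landing_reward(velocity, target_velocity, fuel_used, attitude_angle,
                   max_angle=0.5, w_track=1.0, w_fuel=0.1, w_violation=10.0):
    """Per-step reward; all weights and the angle limit are illustrative assumptions."""
    # Dense process reward: smaller velocity tracking error gives a higher reward
    tracking = -w_track * abs(velocity - target_velocity)
    # Penalty proportional to the fuel burned during this step
    fuel = -w_fuel * fuel_used
    # Fixed penalty whenever the attitude angle leaves its safe range
    violation = -w_violation if abs(attitude_angle) > max_angle else 0.0
    return tracking + fuel + violation
```

Because the tracking term is evaluated at every step, the agent receives feedback long before touchdown, which is the usual motivation for adding process rewards to a sparse-reward task.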

35

Simão, Thiago D., Marnix Suilen, and Nils Jansen. "Safe Policy Improvement for POMDPs via Finite-State Controllers." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (June 26, 2023): 15109–17. http://dx.doi.org/10.1609/aaai.v37i12.26763.

Abstract

We study safe policy improvement (SPI) for partially observable Markov decision processes (POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) historical data about an environment, and (2) the so-called behavior policy that previously generated this data by interacting with the environment. SPI methods neither require access to a model nor the environment itself, and aim to reliably improve upon the behavior policy in an offline manner. Existing methods make the strong assumption that the environment is fully observable. In our novel approach to the SPI problem for POMDPs, we assume that a finite-state controller (FSC) represents the behavior policy and that finite memory is sufficient to derive optimal policies. This assumption allows us to map the POMDP to a finite-state fully observable MDP, the history MDP. We estimate this MDP by combining the historical data and the memory of the FSC, and compute an improved policy using an off-the-shelf SPI algorithm. The underlying SPI method constrains the policy space according to the available data, such that the newly computed policy only differs from the behavior policy when sufficient data is available. We show that this new policy, converted into a new FSC for the (unknown) POMDP, outperforms the behavior policy with high probability. Experimental results on several well-established benchmarks show the applicability of the approach, even in cases where finite memory is not sufficient.
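The idea of letting the new policy differ from the behavior policy only where enough data exists can be sketched for a small tabular MDP. The abstract does not name the off-the-shelf SPI algorithm, so the rule below (keep the behavior probability for rarely observed actions, move the remaining mass to the best well-observed action) is an assumption in the spirit of count-based SPI methods; the function name, threshold `n_min`, and toy numbers are illustrative.

```python
import numpy as np


def constrained_greedy_policy(pi_b, q, counts, n_min):
    """Greedy improvement that only deviates from pi_b on well-observed actions."""
    pi_b = np.asarray(pi_b, dtype=float)  # behavior policy, shape (S, A)
    q = np.asarray(q, dtype=float)        # estimated action values, shape (S, A)
    counts = np.asarray(counts)           # dataset counts N(s, a), shape (S, A)
    pi = np.zeros_like(pi_b)
    for s in range(pi_b.shape[0]):
        trusted = counts[s] >= n_min
        # Rarely observed actions keep exactly the behavior policy's probability
        pi[s, ~trusted] = pi_b[s, ~trusted]
        if trusted.any():
            # Remaining probability mass goes to the best well-observed action
            best = int(np.argmax(np.where(trusted, q[s], -np.inf)))
            pi[s, best] += pi_b[s, trusted].sum()
    return pi


improved = constrained_greedy_policy(
    pi_b=[[0.5, 0.3, 0.2]], q=[[1.0, 2.0, 3.0]], counts=[[10, 10, 1]], n_min=5
)
```

In the toy call, the third action has the highest estimated value but only one observation, so it keeps its behavior probability of 0.2 and the rest of the mass moves to the second action.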

36

Zhang, Linrui, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, and Dacheng Tao. "Evaluating Model-Free Reinforcement Learning toward Safety-Critical Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (June 26, 2023): 15313–21. http://dx.doi.org/10.1609/aaai.v37i12.26786.

Abstract

Safety comes first in many real-world applications involving autonomous agents. Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of those algorithms that adheres to safety constraints at each decision step under complex and unknown dynamics. In this paper, we revisit prior work in this scope from the perspective of state-wise safe RL and categorize them as projection-based, recovery-based, and optimization-based approaches, respectively. Furthermore, we propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. This novel technique explicitly enforces hard constraints via the deep unrolling architecture and enjoys structural advantages in navigating the trade-off between reward improvement and constraint satisfaction. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit, a toolkit that provides off-the-shelf interfaces and evaluation utilities for safety-critical tasks. We then perform a comparative study of the involved algorithms on six benchmarks ranging from robotic control to autonomous driving. The empirical results provide an insight into their applicability and robustness in learning zero-cost-return policies without task-dependent handcrafting. The project page is available at https://sites.google.com/view/saferlkit.
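As a point of reference for the projection-based category mentioned above, the simplest case is a single linear safety constraint on the action, where the closest safe action has a closed form. The sketch below is a generic Euclidean projection onto a half-space, not the Unrolling Safety Layer and not code from SafeRL-Kit; the constraint coefficients `c` and `b` are assumed to be given (for example, by a learned cost model) and `c` is assumed to be non-zero.

```python
import numpy as np


def project_to_halfspace(action, c, b):
    """Return the closest point to `action` that satisfies c @ a + b <= 0."""
    action = np.asarray(action, dtype=float)
    c = np.asarray(c, dtype=float)
    violation = float(c @ action + b)
    if violation <= 0.0:
        return action  # already safe, leave the action untouched
    # Move along -c just far enough to reach the constraint boundary
    return action - (violation / float(c @ c)) * c


safe = project_to_halfspace([2.0, 0.0], c=[1.0, 0.0], b=-1.0)
```

Here the proposed action violates the constraint a[0] <= 1, so it is moved to the boundary point [1, 0]; actions that already satisfy the constraint pass through unchanged.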

37

Angele, Martin K., Nadia Smail, Markus W. Knöferl, Alfred Ayala, William G. Cioffi, and Irshad H. Chaudry. "l-Arginine restores splenocyte functions after trauma and hemorrhage potentially by improving splenic blood flow." American Journal of Physiology-Cell Physiology 276, no. 1 (January 1, 1999): C145–C151. http://dx.doi.org/10.1152/ajpcell.1999.276.1.c145.

Abstract

Several studies indicate that immune responses are markedly depressed early after onset of hemorrhage. Decreased organ blood flow has been implicated in the pathophysiology of altered immune responses after trauma-hemorrhage. In this regard, administration of l-arginine has been shown to restore depressed intestinal and hepatic blood flow after trauma-hemorrhage, probably due to provision of substrate for constitutive nitric oxide synthase (cNOS). It remains unknown, however, whether administration of l-arginine also ameliorates depressed splenic blood flow and whether this agent has any salutary effects on depressed splenocyte functions after trauma-hemorrhage. Male rats underwent sham operation or laparotomy and were bled to and maintained at a mean arterial blood pressure of 40 mmHg until 40% of maximum shed blood volume (MBV) was returned as Ringer lactate (RL). Hemorrhaged rats were then resuscitated with RL (4 times MBV over 1 h). During resuscitation, rats received 300 mg/kg l-arginine or saline (vehicle) intravenously; 4 h later, splenic blood flow, splenocyte proliferation, and splenocyte interleukin (IL)-2 and IL-3 were determined. Administration of l-arginine improved depressed splenic blood flow and restored depressed splenocyte functions after trauma-hemorrhage. Therefore, provision of l-arginine during resuscitation after trauma-hemorrhage should be considered a novel and safe approach for improving splenic organ blood flow and depressed splenocyte functions under such conditions.

38

Staessens, Tom, Tom Lefebvre, and Guillaume Crevecoeur. "Optimizing Cascaded Control of Mechatronic Systems through Constrained Residual Reinforcement Learning." Machines 11, no. 3 (March 20, 2023): 402. http://dx.doi.org/10.3390/machines11030402.

Abstract

Cascaded control structures are prevalent in industrial systems with many disturbances to obtain stable control but are cumbersome and challenging to tune. In this work, we propose cascaded constrained residual reinforcement learning (RL), an intuitive method that improves the performance of a cascaded control structure while maintaining safe operation at all times. We draw inspiration from the constrained residual RL framework, in which a constrained reinforcement learning agent learns corrective adaptations to a base controller’s output to increase optimality. We first revisit the interplay between the residual agent and the baseline controller and subsequently extend this to the cascaded case. We analyze the differences and challenges this structure brings and derive from this some principled insights into the stability and operation of the cascaded residual architecture. Next, we propose a novel actor structure to enable efficient learning under the cascaded setting. We show that the standard algorithm is suboptimal for application to cascaded control structures and validate our method on a high-fidelity simulator of a dual motor drivetrain, resulting in a performance improvement of 14.7% on average, with only a minor decrease in performance occurring during the training phase. We study the different principles constituting the method and examine and validate their contribution to the algorithm’s performance under the considered cascaded control structure.
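The basic residual structure, a learned correction added to a base controller's output, can be sketched in a few lines. The abstract does not specify how the residual is constrained, so bounding it to a fixed magnitude and clipping the sum to actuator limits is an assumption made here for illustration; all parameter names are hypothetical.

```python
def residual_action(base_output, residual, max_residual, low, high):
    """Combine a base controller output with a bounded learned correction."""
    # Limit how far the learned agent may move away from the base controller
    bounded = max(-max_residual, min(max_residual, residual))
    # Keep the final command inside the actuator range [low, high]
    return max(low, min(high, base_output + bounded))
```

For example, `residual_action(1.0, 5.0, 0.5, -2.0, 2.0)` returns 1.5: the agent asked for a correction of 5.0 but is only allowed 0.5. With `max_residual` set to zero the system falls back to the base controller alone, which is one intuitive reason a bounded residual can preserve the baseline's behavior during training.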

39

Lv, Kexuan, Xiaofei Pei, Ci Chen, and Jie Xu. "A Safe and Efficient Lane Change Decision-Making Strategy of Autonomous Driving Based on Deep Reinforcement Learning." Mathematics 10, no. 9 (May 5, 2022): 1551. http://dx.doi.org/10.3390/math10091551.

Abstract

As an indispensable branch of machine learning (ML), reinforcement learning (RL) plays a prominent role in the decision-making process of autonomous driving (AD), which enables autonomous vehicles (AVs) to learn an optimal driving strategy through continuous interaction with the environment. This paper proposes a deep reinforcement learning (DRL)-based motion planning strategy for AD tasks in the highway scenarios where an AV merges into two-lane road traffic flow and realizes the lane changing (LC) maneuvers. We integrate the DRL model into the AD system relying on the end-to-end learning method. An improved DRL algorithm based on deep deterministic policy gradient (DDPG) is developed with well-defined reward functions. In particular, safety rules (SR), safety prediction (SP) module and trauma memory (TM) as well as the dynamic potential-based reward shaping (DPBRS) function are adopted to further enhance safety and accelerate learning of the LC behavior. For validation, the proposed DSSTD algorithm is trained and tested on the dual-computer co-simulation platform. The comparative experimental results show that our proposal outperforms other benchmark algorithms in both driving safety and efficiency.

40

Jurj, Sorin Liviu, Tino Werner, Dominik Grundt, Willem Hagemann, and Eike Möhlmann. "Towards Safe and Sustainable Autonomous Vehicles Using Environmentally-Friendly Criticality Metrics". Sustainability 14, no. 12 (June 7, 2022): 6988. http://dx.doi.org/10.3390/su14126988.

Abstract

This paper presents a mathematical analysis of several criticality metrics used for evaluating the safety of autonomous vehicles (AVs) and also proposes novel environmentally-friendly metrics with the scope of facilitating their selection by future researchers who want to evaluate both safety and the environmental impact of AVs. Regarding this, first, we investigate whether the criticality metrics which are used to quantify the severeness of critical situations in autonomous driving are well-defined and work as intended. In some cases, the well-definedness or the intendedness of the metrics will be apparent, but in other cases, we will present mathematical demonstrations of these properties as well as alternative novel formulas. Additionally, we also present details regarding optimality. Secondly, we propose several novel environmentally-friendly metrics as well as a novel environmentally-friendly criticality metric that combines the safety and environmental impact in a car-following scenario. Third, we discuss the possibility of applying these criticality metrics in artificial intelligence (AI) training such as reinforcement learning (RL) where they can be used as penalty terms such as negative reward components. Finally, we propose a way to apply some of the metrics in a simple car-following scenario and show in our simulation that AVs powered by petrol emitted the most carbon emissions (54.92 g of CO2), being followed closely by diesel-powered AVs (54.67 g of CO2) and then by grid-electricity-powered AVs (31.16 g of CO2). Meanwhile, the AVs powered by electricity from a green source, such as solar energy, had 0 g of CO2 emissions, encouraging future researchers and the industry to develop more actively sustainable methods and metrics for powering and evaluating the safety and environmental impact of AVs using green energy.
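
The abstract above suggests using criticality metrics as negative reward components in RL. A minimal sketch of that idea follows, using time-to-collision (TTC) in a car-following scenario. TTC is a commonly used criticality metric chosen here as an assumption; the abstract does not say which metrics the paper uses, and the 4 s threshold and linear penalty shape are hypothetical.

```python
import math


def time_to_collision(gap_m, follower_speed, leader_speed):
    """TTC in seconds; infinite when the follower is not closing in on the leader."""
    closing_speed = follower_speed - leader_speed
    if closing_speed <= 0.0:
        return math.inf
    return gap_m / closing_speed


def criticality_penalty(ttc_s, threshold_s=4.0, weight=1.0):
    """Negative reward component that grows linearly as TTC drops below the threshold."""
    if ttc_s >= threshold_s:
        return 0.0
    return -weight * (threshold_s - ttc_s) / threshold_s
```

With a 20 m gap and a 10 m/s closing speed the TTC is 2 s, which yields a penalty of -0.5 under these illustrative settings.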

41

Maw, Aye Aye, Maxim Tyan, Tuan Anh Nguyen, and Jae-Woo Lee. "iADA*-RL: Anytime Graph-Based Path Planning with Deep Reinforcement Learning for an Autonomous UAV". Applied Sciences 11, no. 9 (April 27, 2021): 3948. http://dx.doi.org/10.3390/app11093948.

Abstract

Path planning algorithms are of paramount importance in guidance and collision systems to provide trustworthiness and safety for operations of autonomous unmanned aerial vehicles (UAV). Previous works showed different approaches mostly focusing on shortest path discovery without sufficient consideration of local planning and collision avoidance. In this paper, we propose a hybrid path planning algorithm that uses an anytime graph-based path planning algorithm for global planning and deep reinforcement learning for local planning, which is applied to a real-time mission planning system of an autonomous UAV. In particular, we aim to achieve a highly autonomous UAV mission planning system that is adaptive to real-world environments consisting of both static and moving obstacles for collision avoidance capabilities. To achieve adaptive behavior for real-world problems, a simulator is required that can imitate real environments for learning. For this reason, the simulator must be sufficiently flexible to allow the UAV to learn about the environment and to adapt to real-world conditions. In our scheme, the UAV first learns about the environment via a simulator, and only then is it applied to the real world. The proposed system is divided into two main parts: optimal flight path generation and collision avoidance. A hybrid path planning approach is developed by combining a graph-based path planning algorithm with a learning-based algorithm for local planning to allow the UAV to avoid a collision in real time. The global path planning problem is solved in the first stage using a novel anytime incremental search algorithm called improved Anytime Dynamic A* (iADA*). A reinforcement learning method is used to carry out local planning between waypoints, to avoid any obstacles within the environment. The developed hybrid path planning system was investigated and validated in an AirSim environment. A number of different simulations and experiments were performed using the AirSim platform in order to demonstrate the effectiveness of the proposed system for an autonomous UAV. This study helps expand the existing research area in designing efficient and safe path planning algorithms for UAVs.

42

Civetta, Joseph M., and Charles L. Fox. "Advantages of Resuscitation with Balanced Hypertonic Sodium Solution in Disasters". Prehospital and Disaster Medicine 1, S1 (1985): 179–80. http://dx.doi.org/10.1017/s1049023x0004437x.

Abstract

Resuscitation in disasters must be effective, prompt, safe and uncomplicated. Clinical experience in severe, extensive thermal burns in numerous clinics has shown that balanced hypertonic sodium solution (BHSS) can achieve effective resuscitation with: administration of less volume of fluid; early onset of excretion of sodium-containing urine; less generalized edema and without pulmonary edema. This experience is now being transferred to patients after trauma and major surgical procedures often complicated by peritonitis. In an ongoing study of randomly selected adults following surgical trauma, either Ringer's lactate (RL) or the BHSS (0.9% NaCl plus 100 ml of one molar sodium acetate, total 1100 ml, yielding Na 230, Cl 140, acetate 90 mEq/l) was administered. All patients received daily (or more frequent) electrolyte and osmotic analyses of plasma and urine, continuous ICU monitoring of pulmonary and cardiac function, and similar wound care.
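
The electrolyte figures quoted in the abstract above can be checked with a short calculation. The sketch assumes the saline component is 1000 ml (implied by the stated 1100 ml total), that 0.9% means 9 g of NaCl per litre, and a NaCl molar mass of 58.44 g/mol; the function name is hypothetical.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def bhss_concentrations(saline_ml=1000.0, acetate_ml=100.0):
    """Ion concentrations (mEq/l) after mixing 0.9% NaCl with 1 M sodium acetate."""
    # 0.9% w/v saline holds 9 g NaCl per litre; each mmol gives 1 mEq Na and 1 mEq Cl.
    nacl_meq = 9.0 * (saline_ml / 1000.0) / NACL_MOLAR_MASS_G_PER_MOL * 1000.0
    # 1 mol/l sodium acetate holds 1 mmol per ml; each mmol gives 1 mEq Na and 1 mEq acetate.
    acetate_meq = 1.0 * acetate_ml
    total_l = (saline_ml + acetate_ml) / 1000.0
    return {
        "Na": (nacl_meq + acetate_meq) / total_l,
        "Cl": nacl_meq / total_l,
        "acetate": acetate_meq / total_l,
    }
```

Under these assumptions the mixture works out to roughly 231 mEq/l sodium, 140 mEq/l chloride and 91 mEq/l acetate, in line with the rounded values in the abstract.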

43

Wysocka, B. A., Z. Kassam, G. Lockwood, J. Brierley, L. Dawson, and J. Ringash. "Assessment of intra and interfractional organ motion during adjuvant radiochemotherapy in gastric cancer". Journal of Clinical Oncology 25, no. 18_suppl (June 20, 2007): 15132. http://dx.doi.org/10.1200/jco.2007.25.18_suppl.15132.

Abstract

Background: Adjuvant combined chemotherapy and radiotherapy (RT) in gastric cancer improves survival; however, acute toxicity is substantial. Toxicity may be improved with three-dimensional (3D) RT, but organ motion must be considered in planning target volume (PTV) delineation. Methods: Participants (n=22) had baseline free breathing planning CT (CT0) with BodyFix immobilization. Abdominal CTs in free breathing (FB), inhale (I) and exhale (E) states were obtained in weeks 1, 3 and 5 of RT. Datasets were fused to CT0 in the Pinnacle³ 6.2 planning system using bone registration. Volumes of interest (VOIs) [right (R) and left (L) kidney, liver, stomach, pancreas, celiac axis and porta hepatis] were contoured and points of interest (POIs) were placed at each centre of mass. POIs were manually placed at the left dome of diaphragm and splenic hilum. Organ motion was determined by the difference between POI positions in cranial-caudal (CC), anterior-posterior (AP) and right-left (RL) directions. Maximal respiratory motion was determined from the difference between I and E positions. Interfractional displacement in organs relative to bones at weeks 1, 3 and 5 was determined on FB scans as compared to baseline. Results: Interfractional organ motion was maximal in CC direction with median absolute displacements (range) in mm of: splenic hilum 10 (0–52), stomach 8 (0.4–27.2), liver 7.4 (0.5–23.6), diaphragm 6 (0–28), L kidney 5.7 (0–37.3), R kidney 5.3 (0.2–35.3), pancreas 5.7 (0.3–29.1), porta hepatis 4 (0–14) and celiac axis 1.7 (0–9.1). Median interfraction displacement (range) in CC, AP and RL in mm for all organs was: 5.7 (0–52), 2.1 (0–23.1), 2.3 (0–15.9). Positional difference between I and E state (median for all organs) was: 16 mm CC, 5.9 mm AP, and 1.7 mm RL with maximal individual breathing excursions of 59.9, 30.2 and 21.1 mm, respectively. Conclusions: Interfraction organ displacement relative to bones can be quantified and used in the safe design of 3D conformal radiotherapy. Respiratory motion can be substantial in some individuals. Accounting for organ motion in 3D RT planning is necessary and may reduce the toxicity of treatment. No significant financial relationships to disclose.

44

Niu, Tong, and Mohit Bansal. "AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8560–67. http://dx.doi.org/10.1609/aaai.v34i05.6378.

Abstract

Many sequence-to-sequence dialogue models tend to generate safe, uninformative responses. There have been various useful efforts on trying to eliminate them. However, these approaches either improve decoding algorithms during inference, rely on hand-crafted features, or employ complex models. In our work, we build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering. Specifically, we start with a simple yet effective automatic metric, AvgOut, which calculates the average output probability distribution of all time steps on the decoder side during training. This metric directly estimates which tokens are more likely to be generated, thus making it a faithful evaluation of the model diversity (i.e., for diverse models, the token probabilities should be more evenly distributed rather than peaked at a few dull tokens). We then leverage this novel metric to propose three models that promote diversity without losing relevance. The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch; the second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level; the third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal. Moreover, we experiment with a hybrid model by combining the loss terms of MinAvgOut and RL. All four models outperform their base LSTM-RNN model on both diversity and relevance by a large margin, and are comparable to or better than competitive baselines (also verified via human evaluation). Moreover, our approaches are orthogonal to the base model, making them applicable as an add-on to other emerging better dialogue models in the future.
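
The abstract above describes AvgOut as the average output probability distribution over all decoder time steps. A minimal sketch of that averaging step follows. The `normalized_entropy` helper is a hypothetical stand-in for judging how evenly the averaged distribution is spread; it is not the diversity score defined in the paper, and the logits are made-up inputs.

```python
import math


def softmax(logits):
    """Convert one time step's logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def avg_out(step_logits):
    """Average the per-step output distributions over all decoder time steps."""
    dists = [softmax(logits) for logits in step_logits]
    vocab_size = len(dists[0])
    return [sum(d[i] for d in dists) / len(dists) for i in range(vocab_size)]


def normalized_entropy(dist):
    """Hypothetical evenness measure in [0, 1]; 1 means a uniform distribution."""
    entropy = -sum(p * math.log(p) for p in dist if p > 0.0)
    return entropy / math.log(len(dist))
```

A decoder that always puts nearly all of its mass on one token produces an averaged distribution with evenness close to 0, whereas uniform outputs give 1, which matches the abstract's intuition that diverse models spread probability more evenly.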

45

Vivek, Kumar, Shah Amiti, Saha Shivshankar, and Choudhary Lalit. "Electrolyte and Haemogram changes post large volume liposuction comparing two different tumescent solutions". Indian Journal of Plastic Surgery 47, no. 03 (September 2014): 386–93. http://dx.doi.org/10.4103/0970-0358.146604.

Abstract

Background: The most common definitions of large volume liposuction refer to total 5 l volume aspiration during a single procedure (fat plus wetting solution). Profound haemodynamic and metabolic alterations can accompany large volume liposuction. Due to the paucity of literature on the effect of different tumescent solutions on electrolyte balance and haematological changes during large volume liposuction, we carried out this study using two different wetting solutions. Materials and Methods: A total of 30 patients presenting with varying degrees of localized lipodystrophy in different body regions were enrolled in the study. A prospective randomized controlled trial was conducted by the Department of Plastic and Cosmetic Surgery, Sir Ganga Ram Hospital, New Delhi, from January 2011 to June 2012. Patients were randomized into two groups of 15 patients each using computer-generated random numbers. The tumescent formula used for Group A (normal saline [NS]) was our modification of Klein’s formula, and the tumescent formula used for Group B (ringer lactate [RL]) was our modification of Hunstadt’s formula. Serum electrolytes and hematocrit levels were measured at preinduction, in the immediate postoperative period and on postoperative day 1. Result: Statistical analysis, performed using SPSS software version 15.0, showed that statistically significant electrolyte and hematocrit changes occur during large volume liposuction. Conclusion: Statistically significant electrolyte and hematocrit changes occur during large volume liposuction, and patients should be kept under the observation of an anaesthetist for at least 24 h. Patients require strict monitoring of vital parameters, but an Intensive Care Unit is usually not required. There was no statistical difference in the electrolyte changes using NS or RL as tumescent solution, and both solutions were found safe for large volume liposuction.

46

Chebbi, Alif, Massimiliano Tazzari, Cristiana Rizzi, Franco Hernan Gomez Tovar, Sara Villa, Silvia Sbaffoni, Mentore Vaccari, and Andrea Franzetti. "Burkholderia thailandensis E264 as a promising safe rhamnolipids’ producer towards a sustainable valorization of grape marcs and olive mill pomace". Applied Microbiology and Biotechnology 105, no. 9 (April 20, 2021): 3825–42. http://dx.doi.org/10.1007/s00253-021-11292-0.

Abstract

Within the circular economy framework, our study aims to assess the rhamnolipid production from winery and olive oil residues as low-cost carbon sources by nonpathogenic strains. After evaluating various agricultural residues from those two sectors, Burkholderia thailandensis E264 was found to use the raw soluble fraction of nonfermented (white) grape marcs (NF), as the sole carbon and energy source, and simultaneously, reducing the surface tension to around 35 mN/m. Interestingly, this strain showed a rhamnolipid production up to 1070 mg/L (13.37 mg/g of NF), with a higher purity, on those grape marcs, predominately Rha-Rha C14-C14, in MSM medium. On olive oil residues, the rhamnolipid yield of using olive mill pomace (OMP) at 2% (w/v) was around 300 mg/L (15 mg/g of OMP) with a similar CMC of 500 mg/L. To the best of our knowledge, our study indicated for the first time that a nonpathogenic bacterium is able to produce long-chain rhamnolipids in MSM medium supplemented with winery residues, as sole carbon and energy source.

Key points
• Winery and olive oil residues are used for producing long-chain rhamnolipids (RLs).
• Both higher RL yields and purity were obtained on nonfermented grape marcs as substrates.
• Long-chain RLs revealed stabilities over a wide range of pH, temperatures, and salinities.

47

Brown, Jennifer R., Matthew S. Davids, Jordi Rodon, Pau Abrisqueta, Coumaran Egile, Rodrigo Ruiz-Soto, and Farrukh Awan. "Update On The Safety and Efficacy Of The Pan Class I PI3K Inhibitor SAR245408 (XL147) In Chronic Lymphocytic Leukemia and Non-Hodgkin’s Lymphoma Patients". Blood 122, no. 21 (November 15, 2013): 4170. http://dx.doi.org/10.1182/blood.v122.21.4170.4170.

Abstract

Background: Constitutive activation of the phosphatidylinositol 3-kinase (PI3K)/mammalian target of rapamycin (mTOR) pathway by various mechanisms has been implicated in the pathogenesis of chronic lymphocytic leukemia (CLL) and non-Hodgkin’s lymphoma (NHL). There is mounting evidence suggesting that along with PI3Kγ, PI3Kα may be involved in CLL/NHL. SAR245408 is a potent and selective inhibitor of all α, γ and δ class I PI3K isoforms. It has been shown to inhibit PI3K signaling and impact tumor growth in preclinical tumor models. The impact of SAR245408 on safety, tolerability, pharmacokinetics, pharmacodynamics and anti-tumor effect was evaluated in patients with relapsed/refractory CLL and NHL from the Sanofi sponsored phase 1 single-agent study (NCT00486135) [Brown et al. ASH Annual Meeting Abstracts 2011. 118 (21): 2683]. Methods: SAR245408 was administered orally, once daily, with continuous dosing in monthly cycles. A total of 25 patients were enrolled at the maximum tolerated dose identified in solid tumor patients as part of the expansion cohort in relapsed/refractory lymphoproliferative malignancies. Plasma cytokines and chemokines were evaluated using the Myriad RBM Human Discovery MAP250+ panel (> 250 analytes) and ELISA assays. Results: Among the 25 patients (pts), 40% (n=10) had refractory CLL and 60% (n=15) had various relapsed/refractory lymphomas (R/RL), including follicular lymphoma (FL) (n=5), diffuse large B-cell lymphoma (DLBCL) (n=4), Waldenstrom's macroglobulinemia (WM) (n=3), Hodgkin lymphoma (n=2) and B-cell prolymphocytic leukemia (n=1). The median age was 65 years (range 28–83), and 56% were female. Eighty percent of pts had stage III-IV disease and 48% had bulky disease. Five pts were categorized as having refractory disease and the median number of prior regimens was 1 (range 1-7) for CLL pts and 3 (range 0-9) for R/RL pts.
For all 25 patients, the median starting absolute lymphocyte count was 1.15 × 10³/μL (range 0.3–37.2), the median starting hemoglobin was 11.4 g/dL (range 9.1–15) and the median platelet count was 162 × 10³/μL (range 60–431). The median number of cycles administered in CLL pts was 10 (range 5-24) and 5 (range 1-26) in R/RL pts. Six CLL pts experienced an increase in absolute lymphocyte counts within 2-4 weeks of treatment, as has been seen with other PI3K inhibitors. According to modified International Workshop on Chronic Lymphocytic Leukemia (IWCLL) response criteria, 4 CLL pts had partial response (PR), 1 had nodal PR with increased lymphocytosis, and 5 had stable disease (SD); the progression free survival (PFS) in the responding pts was 22, 21.2, 15.6 and 15.4 months while in the SD pts was 12.5, 9.2, 7.4, 5.6, 4.6 and 3.6 months. According to the modified International Working Group response criteria, 3 PRs were reported in R/RL pts [1 WM, 1 DLBCL (transformed) and 1 FL, with PFS of 23.7, 18.4 and 4.8 months, respectively]. One hundred percent of CLL and 80% of R/RL pts reported grade 3 or higher AEs, with the most common (≥ 10% of patients) including neutropenia, diarrhea, anemia and hypotension. SAR245408 induced a reduction in levels of chemokines involved in lymphocyte trafficking in CLL subjects (n=8), including CXCL13, CCL3, CCL22 and CCL19 (64, 58, 52 and 54% reduction, respectively, p<0.05) similar to what was reported with PI3K δ specific inhibitors. In addition, a reduction of tumor necrosis factor receptor 2 (TNFR2) and interleukin-2 receptor alpha (IL-2Rα) levels was observed (63% reduction each, p<0.01). Conclusions: The recommended phase 2 dose of SAR245408 in solid tumor patients was confirmed as safe and tolerable in patients with CLL and R/RL. Single agent SAR245408 demonstrates clinical activity in patients with relapsed or refractory CLL and promising pharmacodynamic effects on chemokine levels involved in lymphocyte trafficking.
Disclosures: Brown: Emergent: Consultancy; Onyx: Consultancy; Sanofi Aventis: Consultancy; Vertex: Consultancy; Novartis: Consultancy; Genzyme: Research Funding; Avila: Consultancy; Celgene: Consultancy, Research Funding; Genentech: Consultancy; Pharmacyclics: Consultancy. Off Label Use: The abstract shows scientific information on SAR245408 which is an investigational product developed by Sanofi. This investigational product is not approved by any health authority for any indication. Egile:Sanofi: Employment. Ruiz-Soto:Sanofi: Employment. Awan:Lymphoma Research Foundation: Research Funding; Spectrum Pharmaceuticals Inc.: Speakers Bureau.

48

Tripathi, Malati, Ayushma Adhikari, and Bibhushan Neupane. "Misoprostol Versus Oxytocin for Induction of Labour at Term and Post Term Pregnancy of Primigravida". Journal of Universal College of Medical Sciences 6, no. 2 (December 3, 2018): 56–59. http://dx.doi.org/10.3126/jucms.v6i2.22497.

Abstract

Introduction: To compare the effectiveness and safety of sublingually administered misoprostol and intravenously infused 10 units of oxytocin for labor induction in term and post-term pregnant women at Gandaki Medical College Teaching Hospital (GMCTH). Materials and methods: This is a prospective study conducted in the Department of Obstetrics and Gynaecology at Gandaki Medical College on 120 primigravida patients with cephalic presentation at term and post-term pregnancy. Patients were given either 50 µg sublingual misoprostol 6 hourly (two doses) or 5 units of oxytocin in 500 ml RL, started at 10 drops and increased up to 60 drops until effective contractions occurred, with a maximum of 10 units of oxytocin. Maternal and fetal outcomes were observed. Collected data were analyzed using SPSS and MS Excel. Results: There were no significant differences between the groups in induction-to-delivery time, indications for caesarean section, modes of delivery, or Apgar scores at one and five minutes. Conclusion: Both oxytocin and misoprostol are effective and safe for induction of labour.

49

Olupot-Olupot, Peter, Florence Aloroker, Ayub Mpoya, Hellen Mnjalla, George Passi, Margaret Nakuya, Kirsty Houston, et al. "Gastroenteritis Rehydration Of children with Severe Acute Malnutrition (GASTROSAM): A Phase II Randomised Controlled trial: Trial Protocol". Wellcome Open Research 6 (June 23, 2021): 160. http://dx.doi.org/10.12688/wellcomeopenres.16885.1.

Abstract

Background: Children hospitalised with severe acute malnutrition (SAM) frequently (>50%) have diarrhoea (≥3 watery stools/day) as a complication, which is accompanied by poor outcomes. Rehydration guidelines for SAM are exceptionally conservative and controversial, based upon expert opinion. The guidelines only permit use of intravenous fluids for cases with advanced shock and exclusive use of low sodium intravenous and oral rehydration solutions (ORS) for fear of fluid and/or sodium overload. Children managed in accordance with these guidelines have a very high mortality. The proposed GASTROSAM trial is the first step in reappraising current recommendations. We hypothesize that liberal rehydration strategies for both intravenous and oral rehydration in SAM children with diarrhoea may reduce adverse outcomes. Methods: An open Phase II trial, with a partial factorial design, enrolling Ugandan and Kenyan children aged 6 months to 12 years with SAM hospitalised with gastroenteritis (>3 loose stools/day) and signs of moderate and severe dehydration. In Stratum A (severe dehydration) children will be randomised (1:1:2) to WHO plan C (100 ml/kg Ringer's Lactate (RL) with intravenous rehydration given over 3-6 hours according to age, including boluses for shock), slow rehydration (100 ml/kg RL over 8 hours, no boluses) or the WHO SAM rehydration regime (ORS only, with boluses for shock; standard of care). Stratum B incorporates all children with moderate dehydration and severe dehydration post-intravenous rehydration and compares (1:1 ratio) standard WHO ORS given for non-SAM (experimental) versus WHO SAM-recommended low-sodium ReSoMal. The primary outcome for intravenous rehydration is urine output (ml/kg/hour at 8 hours post-randomisation), and for oral rehydration a change in sodium levels at 24 hours post-randomisation. This trial will also generate feasibility, safety and preliminary data on survival to 28 days. Discussion: If current rehydration strategies for non-malnourished children are safe in SAM this could prompt future evaluation in Phase III trials.

50

Jiang, Jianhua, Yangang Ren, Yang Guan, Shengbo Eben Li, Yuming Yin, Dongjie Yu, and Xiaoping Jin. "Integrated decision and control at multi-lane intersections with mixed traffic flow". Journal of Physics: Conference Series 2234, no. 1 (April 1, 2022): 012015. http://dx.doi.org/10.1088/1742-6596/2234/1/012015.

Abstract

Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles and pedestrians. The driving policy should make safe decisions to handle the dynamic traffic conditions and meet the requirements of on-board computation. However, most current research focuses on simplified intersections considering only the surrounding vehicles and idealized traffic lights. This paper improves the integrated decision and control framework and develops a learning-based algorithm to deal with complex intersections with mixed traffic flows, which can not only take account of realistic characteristics of traffic lights, but also learn a safe policy under different safety constraints. We first consider different velocity models for green and red lights in the training process and use a finite state machine to handle different modes of light transformation. Then we design different types of distance constraints for vehicles, traffic lights, pedestrians and bicycles, respectively, and formulate the constrained optimal control problems (OCPs) to be optimized. Finally, reinforcement learning (RL) with value and policy networks is adopted to solve the series of OCPs. In order to verify the safety and efficiency of the proposed method, we design a multi-lane intersection with large-scale mixed traffic participants and set practical traffic light phases. The simulation results indicate that the trained decision and control policy balances safety and tracking performance well. Compared with model predictive control (MPC), the computational time is three orders of magnitude lower.
