Journal articles on the topic "Reinforcement learning (Machine learning)"

Below are the top 50 journal articles for research on the topic "Reinforcement learning (Machine learning)".

1

Ishii, Shin, and Wako Yoshida. "Part 4: Reinforcement learning: Machine learning and natural learning". New Generation Computing 24, no. 3 (September 2006): 325–50. http://dx.doi.org/10.1007/bf03037338.

2

Wang, Zizhuang. "Temporal-Related Convolutional-Restricted-Boltzmann-Machine Capable of Learning Relational Order via Reinforcement Learning Procedure". International Journal of Machine Learning and Computing 7, no. 1 (February 2017): 1–8. http://dx.doi.org/10.18178/ijmlc.2017.7.1.610.

3

Butlin, Patrick. "Machine Learning, Functions and Goals". Croatian Journal of Philosophy 22, no. 66 (December 27, 2022): 351–70. http://dx.doi.org/10.52685/cjp.22.66.5.

Abstract:
Machine learning researchers distinguish between reinforcement learning and supervised learning and refer to reinforcement learning systems as “agents”. This paper vindicates the claim that systems trained by reinforcement learning are agents while those trained by supervised learning are not. Systems of both kinds satisfy Dretske’s criteria for agency, because they both learn to produce outputs selectively in response to inputs. However, reinforcement learning is sensitive to the instrumental value of outputs, giving rise to systems which exploit the effects of outputs on subsequent inputs to achieve good performance over episodes of interaction with their environments. Supervised learning systems, in contrast, merely learn to produce better outputs in response to individual inputs.
4

Martín-Guerrero, José D., and Lucas Lamata. "Reinforcement Learning and Physics". Applied Sciences 11, no. 18 (September 16, 2021): 8589. http://dx.doi.org/10.3390/app11188589.

Abstract:
Machine learning techniques provide a remarkable tool for advancing scientific research, and this area has significantly grown in the past few years. In particular, reinforcement learning, an approach that maximizes a (long-term) reward by means of the actions taken by an agent in a given environment, allows one to optimize scientific discovery in a variety of fields such as physics, chemistry, and biology. Moreover, physical systems, in particular quantum systems, may allow for more efficient reinforcement learning protocols. In this review, we describe recent results in the field of reinforcement learning and physics. We include standard reinforcement learning techniques from the computer science community for enhancing physics research, as well as the more recent and emerging area of quantum reinforcement learning, within quantum machine learning, for improving reinforcement learning computations.
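
For readers unfamiliar with the loop this abstract describes (an agent acting in an environment to maximize long-term reward), a minimal tabular Q-learning sketch follows. It is illustrative only: the Gym-style reset/step/sample_action interface and all hyperparameters are assumptions of this note, not details from the review.

```python
# Illustrative sketch: tabular Q-learning against a generic environment with
# reset()/step()/sample_action() methods (an assumed, Gym-style interface).
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    Q = defaultdict(float)  # Q[(state, action)] -> estimated long-term reward
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit the best known action, sometimes explore
            if random.random() < eps:
                action = env.sample_action()
            else:
                action = max(range(env.n_actions), key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # temporal-difference update toward reward plus discounted future value
            best_next = max(Q[(next_state, a)] for a in range(env.n_actions))
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```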
5

Liu, Yicen, Yu Lu, Xi Li, Wenxin Qiao, Zhiwei Li, and Donghao Zhao. "SFC Embedding Meets Machine Learning: Deep Reinforcement Learning Approaches". IEEE Communications Letters 25, no. 6 (June 2021): 1926–30. http://dx.doi.org/10.1109/lcomm.2021.3061991.

6

Popkov, Yuri S., Yuri A. Dubnov, and Alexey Yu Popkov. "Reinforcement Procedure for Randomized Machine Learning". Mathematics 11, no. 17 (August 23, 2023): 3651. http://dx.doi.org/10.3390/math11173651.

Abstract:
This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. The dependences of the dimensions of the neighborhood of the global minimum and the probability of its achievement on the parameters of the algorithm are determined. The convergence of the algorithm with the indicated probability to the neighborhood of the global minimum is proved.
7

Crawford, Daniel, Anna Levit, Navid Ghadermarzy, Jaspreet S. Oberoi, and Pooya Ronagh. "Reinforcement learning using quantum Boltzmann machines". Quantum Information and Computation 18, no. 1&2 (February 2018): 51–74. http://dx.doi.org/10.26421/qic18.1-2-3.

Abstract:
We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the set of visible nodes representing the states and actions of an optimal policy are the first and last layers of the deep network. In the absence of a transverse field, our simulations show that DBMs are trained more effectively than restricted Boltzmann machines (RBM) with the same number of nodes. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This method also outperforms the reinforcement learning method that uses RBMs.
8

Lamata, Lucas. "Quantum Reinforcement Learning with Quantum Photonics". Photonics 8, no. 2 (January 28, 2021): 33. http://dx.doi.org/10.3390/photonics8020033.

Abstract:
Quantum machine learning has emerged as a promising paradigm that could accelerate machine learning calculations. Inside this field, quantum reinforcement learning aims at designing and building quantum agents that may exchange information with their environment and adapt to it, with the aim of achieving some goal. Different quantum platforms have been considered for quantum machine learning and specifically for quantum reinforcement learning. Here, we review the field of quantum reinforcement learning and its implementation with quantum photonics. This quantum technology may enhance quantum computation and communication, as well as machine learning, via the fruitful marriage between these previously unrelated fields.
9

Sahu, Santosh Kumar, Anil Mokhade, and Neeraj Dhanraj Bokde. "An Overview of Machine Learning, Deep Learning, and Reinforcement Learning-Based Techniques in Quantitative Finance: Recent Progress and Challenges". Applied Sciences 13, no. 3 (February 2, 2023): 1956. http://dx.doi.org/10.3390/app13031956.

Abstract:
Forecasting the behavior of the stock market is a classic but difficult topic, one that has attracted the interest of both economists and computer scientists. Over the course of the last couple of decades, researchers have investigated linear models as well as models that are based on machine learning (ML), deep learning (DL), reinforcement learning (RL), and deep reinforcement learning (DRL) in order to create an accurate predictive model. Machine learning algorithms can now extract high-level financial market data patterns. Investors are using deep learning models to anticipate and evaluate stock and foreign exchange markets due to the advantage of artificial intelligence. Recent years have seen a proliferation of the deep reinforcement learning algorithm’s application in algorithmic trading. DRL agents, which combine price prediction and trading signal production, have been used to construct several completely automated trading systems or strategies. Our objective is to enable interested researchers to stay current and easily imitate earlier findings. In this paper, we have worked to explain the utility of Machine Learning, Deep Learning, Reinforcement Learning, and Deep Reinforcement Learning in Quantitative Finance (QF) and the Stock Market. We also outline potential future study paths in this area based on the overview that was presented before.
10

Fang, Qiang, Wenzhuo Zhang, and Xitong Wang. "Visual Navigation Using Inverse Reinforcement Learning and an Extreme Learning Machine". Electronics 10, no. 16 (August 18, 2021): 1997. http://dx.doi.org/10.3390/electronics10161997.

Abstract:
In this paper, we focus on the challenges of training efficiency, the design of reward functions, and generalization in reinforcement learning for visual navigation, and we propose a regularized extreme learning machine-based inverse reinforcement learning approach (RELM-IRL) to improve navigation performance. Our contributions are mainly three-fold: First, a framework combining an extreme learning machine with inverse reinforcement learning is presented. This framework can improve sample efficiency, obtain the reward function directly from the image information observed by the agent, and improve generalization to new targets and new environments. Second, the extreme learning machine is regularized by multi-response sparse regression and the leave-one-out method, which can further improve the generalization ability. Simulation experiments in the AI-THOR environment showed that the proposed approach outperformed previous end-to-end approaches, thus demonstrating the effectiveness and efficiency of our approach.
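
As background for the extreme-learning-machine component named above, the sketch below shows the core ELM idea: a random, untrained hidden layer followed by a closed-form least-squares readout. The shapes, tanh activation, and pseudo-inverse solve are generic textbook choices, not the paper's regularized variant (which adds multi-response sparse regression and leave-one-out selection).

```python
# Minimal extreme learning machine: only the readout weights are fitted.
import numpy as np

def elm_fit(X, y, n_hidden=100, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # closed-form least-squares readout
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```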
11

Sarikhani, Rahil, and Farshid Keynia. "Cooperative Spectrum Sensing Meets Machine Learning: Deep Reinforcement Learning Approach". IEEE Communications Letters 24, no. 7 (July 2020): 1459–62. http://dx.doi.org/10.1109/lcomm.2020.2984430.

12

AlDahoul, Nouar, Zaw Zaw Htike, and Rini Akmeliawati. "Hierarchical extreme learning machine based reinforcement learning for goal localization". IOP Conference Series: Materials Science and Engineering 184 (March 2017): 012055. http://dx.doi.org/10.1088/1757-899x/184/1/012055.

13

McPartland, Michelle, and Marcus Gallagher. "Learning to be a Bot: Reinforcement Learning in Shooter Games". Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 4, no. 1 (September 27, 2021): 78–83. http://dx.doi.org/10.1609/aiide.v4i1.18676.

Abstract:
This paper demonstrates the applicability of reinforcement learning for first person shooter bot artificial intelligence. Reinforcement learning is a machine learning technique where an agent learns a problem through interaction with the environment. The Sarsa(λ) algorithm will be applied to a first person shooter bot controller to learn the tasks of (1) navigation and item collection, and (2) combat. The results will show the validity and diversity of reinforcement learning in a first person shooter environment.
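
For reference, here is a hedged sketch of the tabular Sarsa(λ) update with accumulating eligibility traces, the algorithm the abstract names; the bot's actual state and action encodings are not described here, so the environment interface is assumed.

```python
# Sarsa(lambda) with accumulating eligibility traces (illustrative interface).
import random
from collections import defaultdict

def epsilon_greedy(Q, state, n_actions, eps):
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(state, a)])

def sarsa_lambda_episode(env, Q, alpha=0.1, gamma=0.95, lam=0.9, eps=0.1):
    E = defaultdict(float)                      # eligibility traces
    state = env.reset()
    action = epsilon_greedy(Q, state, env.n_actions, eps)
    done = False
    while not done:
        next_state, reward, done = env.step(action)
        next_action = epsilon_greedy(Q, next_state, env.n_actions, eps)
        delta = reward + gamma * Q[(next_state, next_action)] - Q[(state, action)]
        E[(state, action)] += 1.0               # mark the visited pair as eligible
        for key in list(E):                     # propagate the TD error backwards
            Q[key] += alpha * delta * E[key]
            E[key] *= gamma * lam               # decay traces toward zero
        state, action = next_state, next_action
```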
14

Zine, Mohamed, Fouzi Harrou, Mohammed Terbeche, Mohammed Bellahcene, Abdelkader Dairi, and Ying Sun. "E-Learning Readiness Assessment Using Machine Learning Methods". Sustainability 15, no. 11 (June 1, 2023): 8924. http://dx.doi.org/10.3390/su15118924.

Abstract:
Assessing e-learning readiness is crucial for educational institutions to identify areas in their e-learning systems needing improvement and to develop strategies to enhance students’ readiness. This paper presents an effective approach for assessing e-learning readiness by combining the ADKAR model and machine learning-based feature importance identification methods. The motivation behind using machine learning approaches lies in their ability to capture nonlinearity in data and flexibility as data-driven models. This study surveyed faculty members and students in the Economics faculty at Tlemcen University, Algeria, to gather data based on the ADKAR model’s five dimensions: awareness, desire, knowledge, ability, and reinforcement. Correlation analysis revealed a significant relationship between all dimensions. Specifically, the pairwise correlation coefficients between readiness and awareness, desire, knowledge, ability, and reinforcement are 0.5233, 0.5983, 0.6374, 0.6645, and 0.3693, respectively. Two machine learning algorithms, random forest (RF) and decision tree (DT), were used to identify the most important ADKAR factors influencing e-learning readiness. In the results, ability and knowledge were consistently identified as the most significant factors, with scores of ability (0.565, 0.514) and knowledge (0.170, 0.251) using RF and DT algorithms, respectively. Additionally, SHapley Additive exPlanations (SHAP) values were used to explore further the impact of each variable on the final prediction, highlighting ability as the most influential factor. These findings suggest that universities should focus on enhancing students’ abilities and providing them with the necessary knowledge to increase their readiness for e-learning. This study provides valuable insights into the factors influencing university students’ e-learning readiness.
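
The feature-importance step described above can be outlined with scikit-learn as follows. The column names mirror the ADKAR dimensions from the abstract, but the data file and model settings are placeholder assumptions.

```python
# Sketch: impurity-based feature importance over the ADKAR dimensions.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

df = pd.read_csv("adkar_survey.csv")  # hypothetical file of survey responses
X = df[["awareness", "desire", "knowledge", "ability", "reinforcement"]]
y = df["readiness"]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, score in sorted(zip(X.columns, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")  # importance score per ADKAR factor
```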
15

Zhang, Ziyu. "Basic things about reinforcement learning". Applied and Computational Engineering 6, no. 1 (June 14, 2023): 280–84. http://dx.doi.org/10.54254/2755-2721/6/20230788.

Abstract:
Artificial intelligence is currently a very popular topic, and machine learning is one of its main approaches, consisting of supervised learning, unsupervised learning, and reinforcement learning; supervised and unsupervised learning are already relatively mature. Reinforcement learning technology has a long history, but it was not until the late '80s and early '90s that it became widely used in artificial intelligence and machine learning. Generally, reinforcement learning is a process of trial and error: the agent chooses an action according to feedback from the environment, and this step is repeated many times until the best policy is found. Mapped to reality, it can help humans fulfill missions that were nearly impossible before. However, there are still some potential problems in reinforcement learning. This essay compares some basic algorithms related to RL to help readers gain a basic understanding of RL and points out some existing defects.
16

Yang, Yanxiang, Jiang Hu, Dana Porter, Thomas Marek, Kevin Heflin, and Hongxin Kong. "Deep Reinforcement Learning-Based Irrigation Scheduling". Transactions of the ASABE 63, no. 3 (2020): 549–56. http://dx.doi.org/10.13031/trans.13633.

Abstract:
Highlights: Deep reinforcement learning-based irrigation scheduling is proposed to determine the amount of irrigation required at each time step, considering soil moisture level, evapotranspiration, forecast precipitation, and crop growth stage. The proposed methodology was compared in simulation with traditional irrigation scheduling approaches and some machine learning-based scheduling approaches.
Machine learning has been widely applied in many areas, with promising results and large potential. In this article, deep reinforcement learning-based irrigation scheduling is proposed. This approach can automate the irrigation process and can achieve highly precise water application that results in higher simulated net return. Using this approach, the irrigation controller can automatically determine the optimal or near-optimal water application amount. Traditional reinforcement learning can be superior to traditional periodic and threshold-based irrigation scheduling. However, traditional reinforcement learning fails to accurately represent a real-world irrigation environment due to its limited state space. Compared with traditional reinforcement learning, the deep reinforcement learning method can better model a real-world environment based on multi-dimensional observations. Simulations for various weather conditions and crop types show that the proposed deep reinforcement learning irrigation scheduling can increase net return.
Keywords: Automated irrigation scheduling, Deep reinforcement learning, Machine learning.
17

DiGiovanna, J., B. Mahmoudi, J. Fortes, J. C. Principe, and J. C. Sanchez. "Coadaptive Brain–Machine Interface via Reinforcement Learning". IEEE Transactions on Biomedical Engineering 56, no. 1 (January 2009): 54–64. http://dx.doi.org/10.1109/tbme.2008.926699.

18

NAKATANI, Masayuki, Zeyuan SUN, and Yutaka UCHIMURA. "Intelligent Construction Machine by Deep Reinforcement Learning". Proceedings of JSME Annual Conference on Robotics and Mechatronics (Robomec) 2017 (2017): 2P2-G03. http://dx.doi.org/10.1299/jsmermd.2017.2p2-g03.

19

Mahdi, Hiba. "Blockchain and Machine Learning as Deep Reinforcement". Wasit Journal of Computer and Mathematics Science 2, no. 1 (March 31, 2023): 72–84. http://dx.doi.org/10.31185/wjcm.103.

Abstract:
Due to its capacity to make wise decisions, deep learning has become extremely popular in recent years. Current deep learning architectures, which rely heavily on centralized servers, are unable to offer attributes such as operational transparency, stability, security, and reliable data provenance. Additionally, a single point of failure is a problem to which deep learning designs are susceptible, since they need centralized data for training. We review the body of research on the application of deep learning to blockchain. We categorize and arrange the literature into a topic taxonomy based on the following criteria: application domain, deep learning-specific consensus mechanisms, goals for deployment, and blockchain type. To facilitate meaningful discussion, we list the benefits and drawbacks of the most cutting-edge blockchain-based deep learning frameworks.
20

Hoshino, Yukinobu, and Katsuari Kamei. "Effective Use of Learning Knowledge by FEERL". Journal of Advanced Computational Intelligence and Intelligent Informatics 7, no. 1 (February 20, 2003): 6–9. http://dx.doi.org/10.20965/jaciii.2003.p0006.

Abstract:
Machine learning has been proposed for learning the techniques of specialists. A machine has to learn techniques by trial and error when there are no training examples. Reinforcement learning is a powerful machine learning approach that can learn without training examples being given to the learning unit. However, reinforcement learning cannot handle large environments, because the number of if-then rules explodes combinatorially with the pairing of environment states and actions. We have proposed a new reinforcement learning system for large environments, Fuzzy Environment Evaluation Reinforcement Learning (FEERL). In this paper, we propose reusing the rules acquired by FEERL.
21

Jadhav, Rutuja. "Tracking Locomotion using Reinforcement Learning". International Journal for Research in Applied Science and Engineering Technology 10, no. 7 (July 31, 2022): 1777–83. http://dx.doi.org/10.22214/ijraset.2022.45509.

Abstract:
This article presents the concept of reinforcement learning, which trains a static linear policy for continuous control problems, and adapts cutting-edge techniques for sample efficiency on benchmark MuJoCo locomotion tasks. This model was designed and developed to use the MuJoCo engine to track the movement of robotic structures and to eliminate problems with evaluation computations using perceptrons and random search algorithms. Here, the machine learning model is trained to make a series of decisions. The humanoid model is considered one of the most difficult, ongoing problems to solve by applying state-of-the-art RL technology. The field of machine learning has a great influence on the training model of the RL environment. Here we use random seed values to provide continuous input to achieve optimized results. The goal of this project is to use the MuJoCo engine in a specific context to automatically determine the ideal behavior of the robot in an augmented reality environment. Enhanced random search was introduced to train linear policies to achieve efficiency on MuJoCo locomotion tasks. The results of these models highlight the variability of the MuJoCo benchmark tasks and lead to efficiently optimized rewards.
22

Eckardt, Jan-Niklas, Karsten Wendt, Martin Bornhäuser, and Jan Moritz Middeke. "Reinforcement Learning for Precision Oncology". Cancers 13, no. 18 (September 15, 2021): 4624. http://dx.doi.org/10.3390/cancers13184624.

Abstract:
Precision oncology is grounded in the increasing understanding of genetic and molecular mechanisms that underlie malignant disease and offer different treatment pathways for the individual patient. The growing complexity of medical data has led to the implementation of machine learning techniques that are vastly applied for risk assessment and outcome prediction using either supervised or unsupervised learning. Still largely overlooked is reinforcement learning (RL) that addresses sequential tasks by exploring the underlying dynamics of an environment and shaping it by taking actions in order to maximize cumulative rewards over time, thereby achieving optimal long-term outcomes. Recent breakthroughs in RL demonstrated remarkable results in gameplay and autonomous driving, often achieving human-like or even superhuman performance. While this type of machine learning holds the potential to become a helpful decision support tool, it comes with a set of distinctive challenges that need to be addressed to ensure applicability, validity and safety. In this review, we highlight recent advances of RL focusing on studies in oncology and point out current challenges and pitfalls that need to be accounted for in future studies in order to successfully develop RL-based decision support systems for precision oncology.
23

Kaelbling, L. P., M. L. Littman, and A. W. Moore. "Reinforcement Learning: A Survey". Journal of Artificial Intelligence Research 4 (May 1, 1996): 237–85. http://dx.doi.org/10.1613/jair.301.

Abstract:
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
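
The "foundations of the field via Markov decision theory" that the survey refers to center on the Bellman optimality equation, reproduced here as standard textbook background rather than a quotation from the paper:

```latex
% Optimal value of a state: best one-step reward plus the discounted
% expected value of the successor state (discount factor \gamma).
V^{*}(s) = \max_{a \in A} \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big]
```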
24

Tizhoosh, Hamid R. "Opposition-Based Reinforcement Learning". Journal of Advanced Computational Intelligence and Intelligent Informatics 10, no. 4 (July 20, 2006): 578–85. http://dx.doi.org/10.20965/jaciii.2006.p0578.

Abstract:
Reinforcement learning is a machine intelligence scheme for learning in highly dynamic, probabilistic environments. By interaction with the environment, reinforcement agents learn optimal control policies, especially in the absence of a priori knowledge and/or a sufficiently large amount of training data. Despite its advantages, however, reinforcement learning suffers from a major drawback: high calculation cost, because convergence to an optimal solution usually requires that all states be visited frequently to ensure that the policy is reliable. This is not always possible, however, due to the complex, high-dimensional state space in many applications. This paper introduces opposition-based reinforcement learning, inspired by opposition-based learning, to speed up convergence. Considering opposite actions simultaneously enables individual states to be updated more than once, shortening exploration and expediting convergence. Three versions of the Q-learning algorithm are given as examples. Experimental results for grid world problems of different sizes demonstrate the superior performance of the proposed approach.
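
One plausible reading of the opposition-based idea is sketched below: after the usual Q-learning update for the action taken, the opposite action is updated as well, so each interaction refreshes two table entries. The opposite-reward construction shown is an illustrative assumption; the paper's three Q-learning variants differ in detail.

```python
# Opposition-style double update (illustrative; "opposite" maps each action to
# its counterpart, e.g. left<->right, and the negated reward is an assumption).
def opposition_q_update(Q, s, a, r, s_next, opposite, alpha=0.1, gamma=0.9, n_actions=4):
    best_next = max(Q[(s_next, b)] for b in range(n_actions))
    # standard Q-learning update for the action actually taken
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    # extra update for the opposite action with the assumed opposite reward
    a_opp = opposite[a]
    Q[(s, a_opp)] += alpha * (-r + gamma * best_next - Q[(s, a_opp)])
```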
25

Meng, Terry Lingze, and Matloob Khushi. "Reinforcement Learning in Financial Markets". Data 4, no. 3 (July 28, 2019): 110. http://dx.doi.org/10.3390/data4030110.

Abstract:
Recently there has been an exponential increase in the use of artificial intelligence for trading in financial markets such as stock and forex. Reinforcement learning has become of particular interest to financial traders ever since the program AlphaGo defeated the strongest human contemporary Go board game player Lee Sedol in 2016. We systematically reviewed all recent stock/forex prediction or trading articles that used reinforcement learning as their primary machine learning method. All reviewed articles had some unrealistic assumptions such as no transaction costs, no liquidity issues and no bid or ask spread issues. Transaction costs had significant impacts on the profitability of the reinforcement learning algorithms compared with the baseline algorithms tested. Despite showing statistically significant profitability when reinforcement learning was used in comparison with baseline models in many studies, some showed no meaningful level of profitability, in particular with large changes in the price pattern between the system training and testing data. Furthermore, few performance comparisons between reinforcement learning and other sophisticated machine/deep learning models were provided. The impact of transaction costs, including the bid/ask spread, on profitability has also been assessed. In conclusion, reinforcement learning in stock/forex trading is still in its early development and further research is needed to make it a reliable method in this domain.
26

Senthil, Chandran, and Ranjitharamasamy Sudhakara Pandian. "Proactive Maintenance Model Using Reinforcement Learning Algorithm in Rubber Industry". Processes 10, no. 2 (February 14, 2022): 371. http://dx.doi.org/10.3390/pr10020371.

Abstract:
This paper presents an investigation into the enhancement of the availability of a curing machine deployed in the rubber industry, located in Tamilnadu, India. Machine maintenance is a major task in the rubber industry, due to the demand for the product. Critical component identification in curing machines is necessary to prevent rapid failure followed by subsequent repairs that extend curing machine downtime. A reward in the Reinforcement Learning Algorithm (RLA) prevents frequent downtime by improving the availability of the curing machine at times when unscheduled long-term maintenance would interfere with operation due to the occurrence of an unspecified failure of a critical component. Over time, depreciation and degradation of components in a machine are unavoidable, as is shown in the present investigation through intelligent assessment of the lifespan of components. So far, no effective methodology has been implemented in a real-time maintenance environment. RLAs seem to be more effective when based on intelligent assessment, which encompasses the failure and repair rates used to calculate availability in an automated environment. Training of RLAs is performed to evaluate overall equipment efficiency (OEE) in terms of availability. The availability of a curing machine, in the form of state probabilities, is modeled with first-order differential-difference equations. RLAs maximize the rate of availability of the machine. Preventive maintenance (PM) rates for four modules of 16 curing machines are expressed in a transition diagram, using transition rates. The transition rates indicate the degree of PM and unplanned maintenance that defines the total availability of the four modules. OEE is expressed in terms of the availability of curing machines, which is related to performance and quality. The results obtained by the RLA are promising regarding the short-term and long-term efficiencies of OEE, which are 95.19% and 83.37%, respectively.
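
For orientation, the steady-state availability of a single repairable machine with constant failure rate λ and repair rate μ, together with the usual OEE decomposition, is given below as a textbook relation; the paper's four-module differential-difference model generalizes this single-machine case.

```latex
% Standard background relations (not the paper's exact model):
% A: availability, P: performance rate, Q: quality rate.
A = \frac{\mu}{\lambda + \mu}, \qquad \mathrm{OEE} = A \times P \times Q
```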
27

Saeed, Shaheer U., Yunguan Fu, Vasilis Stavrinides, Zachary M. C. Baum, Qianye Yang, Mirabela Rusu, Richard E. Fan, et al. "Image quality assessment for machine learning tasks using meta-reinforcement learning". Medical Image Analysis 78 (May 2022): 102427. http://dx.doi.org/10.1016/j.media.2022.102427.

28

AlDahoul, Nouar, and ZawZaw Htike. "Utilizing hierarchical extreme learning machine based reinforcement learning for object sorting". International Journal of Advanced and Applied Sciences 6, no. 1 (January 2019): 106–13. http://dx.doi.org/10.21833/ijaas.2019.01.015.

29

Calabuig, J. M., H. Falciani, and E. A. Sánchez-Pérez. "Dreaming machine learning: Lipschitz extensions for reinforcement learning on financial markets". Neurocomputing 398 (July 2020): 172–84. http://dx.doi.org/10.1016/j.neucom.2020.02.052.

30

Zhang, Junhao, and Yifei Lei. "Deep Reinforcement Learning for Stock Prediction". Scientific Programming 2022 (April 30, 2022): 1–9. http://dx.doi.org/10.1155/2022/5812546.

Abstract:
Investors are frequently concerned with the potential return from changes in a company's stock price. However, stock price fluctuations are frequently highly nonlinear and nonstationary, rendering them uncontrollable and the primary reason why the majority of investors earn low long-term returns. Historically, people have simulated and predicted prices using classic econometric models and simple machine learning models. In recent years, an increasing amount of research has been conducted using more complex machine learning and deep learning methods to forecast stock prices, and the research reports indicate that prediction accuracy is gradually improving. While the prediction results and accuracy of these models improve over time, their adaptability in a volatile market environment remains in question. Even highly optimized machine learning algorithms such as the FNN and the RNN are incapable of predicting stock prices that follow random walks, and their results are frequently inconsistent with stock price movements. The purpose of this article is to increase the accuracy and speed of stock price volatility prediction by incorporating a deep reinforcement learning model based on the PG (policy gradient) method. Finally, our tests demonstrate that the new algorithm's prediction accuracy and reward convergence speed are significantly higher than those of the traditional DRL algorithm. As a result, the new algorithm is more adaptable to fluctuating market conditions.
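
As background for the PG (policy gradient) method the abstract invokes, a generic REINFORCE update in PyTorch is sketched below; the network size, the 8-dimensional market features, and the three-action buy/hold/sell scheme are illustrative assumptions, not the paper's architecture.

```python
# Generic REINFORCE sketch: raise the log-probability of actions in
# proportion to the discounted returns they earned.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 3))  # assumed sizes
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_update(states, actions, returns):
    """states: [T, 8] float tensor; actions: [T] long tensor; returns: [T] discounted returns."""
    dist = torch.distributions.Categorical(logits=policy(states))
    loss = -(dist.log_prob(actions) * returns).mean()  # gradient ascent on expected return
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```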
31

Xu, Zhe, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, and Bo Wu. "Joint Inference of Reward Machines and Policies for Reinforcement Learning". Proceedings of the International Conference on Automated Planning and Scheduling 30 (June 1, 2020): 590–98. http://dx.doi.org/10.1609/icaps.v30i1.6756.

Abstract:
Incorporating high-level knowledge is an effective way to expedite reinforcement learning (RL), especially for complex tasks with sparse rewards. We investigate an RL problem where the high-level knowledge is in the form of reward machines, a type of Mealy machines that encode non-Markovian reward functions. We focus on a setting in which this knowledge is a priori not available to the learning agent. We develop an iterative algorithm that performs joint inference of reward machines and policies for RL (more specifically, q-learning). In each iteration, the algorithm maintains a hypothesis reward machine and a sample of RL episodes. It uses a separate q-function defined for each state of the current hypothesis reward machine to determine the policy and performs RL to update the q-functions. While performing RL, the algorithm updates the sample by adding RL episodes along which the obtained rewards are inconsistent with the rewards based on the current hypothesis reward machine. In the next iteration, the algorithm infers a new hypothesis reward machine from the updated sample. Based on an equivalence relation between states of reward machines, we transfer the q-functions between the hypothesis reward machines in consecutive iterations. We prove that the proposed algorithm converges almost surely to an optimal policy in the limit. The experiments show that learning high-level knowledge in the form of reward machines leads to fast convergence to optimal policies in RL, while the baseline RL methods fail to converge to optimal policies after a substantial number of training steps.
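
A hedged sketch of the q-learning step with a hypothesis reward machine follows, reflecting the abstract's description of one q-function per reward-machine state; the labeling function and the machine's transition and reward maps are illustrative assumptions.

```python
# One environment step of q-learning driven by a (hypothesis) reward machine.
from collections import defaultdict

Q = defaultdict(lambda: defaultdict(float))  # Q[u][(state, action)]

def rm_q_step(Q, u, s, a, s_next, delta, sigma, label, alpha=0.1, gamma=0.9, n_actions=4):
    """u: current reward-machine state; delta/sigma: its transition and reward maps;
    label(s, a, s_next) -> event symbol observed in the environment (all assumed)."""
    ev = label(s, a, s_next)
    u_next = delta.get((u, ev), u)           # reward-machine transition on the event
    r = sigma.get((u, ev), 0.0)              # non-Markovian reward from the machine
    best_next = max(Q[u_next][(s_next, b)] for b in range(n_actions))
    Q[u][(s, a)] += alpha * (r + gamma * best_next - Q[u][(s, a)])
    return u_next
```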
32

Koker, Thomas E., and Dimitrios Koutmos. "Cryptocurrency Trading Using Machine Learning". Journal of Risk and Financial Management 13, no. 8 (August 10, 2020): 178. http://dx.doi.org/10.3390/jrfm13080178.

Abstract:
We present a model for active trading based on reinforcement machine learning and apply this to five major cryptocurrencies in circulation. In relation to a buy-and-hold approach, we demonstrate how this model yields enhanced risk-adjusted returns and serves to reduce downside risk. These findings hold when accounting for actual transaction costs. We conclude that real-world portfolio management application of the model is viable, yet, performance can vary based on how it is calibrated in test samples.
33

Chen, Irene Y., Shalmali Joshi, Marzyeh Ghassemi, and Rajesh Ranganath. "Probabilistic Machine Learning for Healthcare". Annual Review of Biomedical Data Science 4, no. 1 (July 20, 2021): 393–415. http://dx.doi.org/10.1146/annurev-biodatasci-092820-033938.

Abstract:
Machine learning can be used to make sense of healthcare data. Probabilistic machine learning models help provide a complete picture of observed data in healthcare. In this review, we examine how probabilistic machine learning can advance healthcare. We consider challenges in the predictive model building pipeline where probabilistic models can be beneficial, including calibration and missing data. Beyond predictive models, we also investigate the utility of probabilistic machine learning models in phenotyping, in generative models for clinical use cases, and in reinforcement learning.
34

Evseenko, Alla, and Dmitrii Romannikov. "Application of Deep Q-learning and double Deep Q-learning algorithms to the task of control an inverted pendulum". Transaction of Scientific Papers of the Novosibirsk State Technical University, no. 1-2 (August 26, 2020): 7–25. http://dx.doi.org/10.17212/2307-6879-2020-1-2-7-25.

Abstract:
Today, such a branch of science as «artificial intelligence» is booming around the world. Systems built on the basis of artificial intelligence methods have the ability to perform functions that are traditionally considered the prerogative of man. Artificial intelligence has a wide range of research areas. One such area is machine learning. This article discusses the algorithms of one of the approaches of machine learning, reinforcement learning (RL), on which a lot of research and development has been carried out over the past seven years. Development and research on this approach is mainly carried out to solve problems in Atari 2600 games or other similar ones. In this article, reinforcement learning is applied to a dynamic object, an inverted pendulum. As a model of this object, we consider a model of an inverted pendulum on a cart taken from the Gym library, which contains many models used to test and analyze reinforcement learning algorithms. The article describes the implementation and study of two algorithms of this approach, Deep Q-learning and Double Deep Q-learning. As a result, training, testing, and training-time graphs for each algorithm are presented, on the basis of which it is concluded that the Double Deep Q-learning algorithm is preferable, because its training time is approximately 2 minutes and it provides the best control of the model of an inverted pendulum on a cart.
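
The difference between the two algorithms compared in the article comes down to how the bootstrap target is computed; a minimal PyTorch sketch of both targets follows (the online/target network split and batch shapes are standard assumptions, not the authors' code).

```python
# Target computation for DQN vs. Double DQN over a batch of transitions.
# reward, done: [B] float tensors; next_state: [B, obs_dim]; nets map obs -> Q-values.
import torch

def dqn_target(reward, next_state, done, target_net, gamma=0.99):
    # DQN: the target network both selects and evaluates the next action,
    # which is known to overestimate values.
    q_next = target_net(next_state).max(dim=1).values
    return reward + gamma * q_next * (1 - done)

def double_dqn_target(reward, next_state, done, q_net, target_net, gamma=0.99):
    # Double DQN: the online network selects the action, the target network
    # evaluates it, reducing the overestimation bias.
    best_action = q_net(next_state).argmax(dim=1, keepdim=True)
    q_next = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * q_next * (1 - done)
```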
35

Shailaja, Dr M., Nune Vinaya Reddy, Ambati Srujani, and Cherukuthota Upeksha Reddy. "Playing Tetris with Reinforcement Learning". International Journal for Research in Applied Science and Engineering Technology 10, no. 6 (June 30, 2022): 2088–95. http://dx.doi.org/10.22214/ijraset.2022.44208.

Abstract:
The essential inspiration for this undertaking was an enjoyable application of machine learning. Tetris is a well-known game that is loved and loathed by many. The game has several qualities that make it an interesting problem for the field of ML. A complete description of the Tetris problem involves an enormous number of states, making the definition of a non-learning strategy practically impossible. Recent results from the team at Google DeepMind have shown that reinforcement learning can achieve remarkable performance at game playing, using a minimal amount of prior information about the game. We use reinforcement learning to train an AI agent to play Tetris. Reinforcement learning allows the machine or software agent to learn its behavior based on the feedback received from the environment. The machine may adapt over time or may learn once and continue with that behavior. Tetris is played on a rectangular grid divided into smaller square regions, typically ten units wide by twenty units tall. The player controls the orientation and horizontal position of pieces that fall from the top of the board to the bottom, and earns points by forming complete horizontal lines, which are then removed from play, causing pieces placed higher to move downward. The key hypothesis of this project is that if the points earned in Tetris are used as the reward function for an AI agent, then that agent should be able to learn to play Tetris without other supervision.
36

Dandoti, Sarosh. "Learning to Survive using Reinforcement Learning with MLAgents". International Journal for Research in Applied Science and Engineering Technology 10, no. 7 (July 31, 2022): 3009–14. http://dx.doi.org/10.22214/ijraset.2022.45526.

Abstract:
Simulations have been around for a long time, in different versions and levels of complexity. Training a reinforcement learning model in a 3D environment lets us draw many new insights from the inference. There have been examples where the AI learns to feed itself, to start walking, to jump, and so on. The reason one trains an entire model, from the agent knowing nothing to being a perfect task achiever, is that new behavioral patterns can be recorded during the process. Reinforcement learning is a feedback-based machine learning technique in which an agent learns how to behave in a given environment by performing actions and observing the outcomes of those actions. For each positive action, the agent receives positive feedback; for each negative action, the agent receives negative feedback or a penalty. A simple agent would learn to perform a task and get some reward on accomplishing it. The agent is also punished if it does something it is not supposed to do. These simple simulations can evolve, try to use their surroundings, and try to fight with other agents to accomplish their goal.
37

Lockwood, Owen, and Mei Si. "Reinforcement Learning with Quantum Variational Circuit". Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 16, no. 1 (October 1, 2020): 245–51. http://dx.doi.org/10.1609/aiide.v16i1.7437.

Abstract:
The development of quantum computational techniques has advanced greatly in recent years, parallel to the advancements in techniques for deep reinforcement learning. This work explores the potential for quantum computing to facilitate reinforcement learning problems. Quantum computing approaches offer important potential improvements in time and space complexity over traditional algorithms because of their ability to exploit the quantum phenomena of superposition and entanglement. Specifically, we investigate the use of quantum variational circuits, a form of quantum machine learning. We present our techniques for encoding classical data for a quantum variational circuit, and we further explore pure and hybrid quantum algorithms for DQN and Double DQN. Our results indicate that both hybrid and pure quantum variational circuits are able to solve reinforcement learning tasks with a smaller parameter space. These comparisons are conducted with two OpenAI Gym environments: CartPole and Blackjack. The success of this work is indicative of a strong future relationship between quantum machine learning and deep reinforcement learning.
38

Meera, A., and S. Swamynathan. "Queue Based Q-Learning for Efficient Resource Provisioning in Cloud Data Centers". International Journal of Intelligent Information Technologies 11, no. 4 (October 2015): 37–54. http://dx.doi.org/10.4018/ijiit.2015100103.

Abstract:
Cloud computing is a novel paradigm that offers virtual resources on demand through the internet. Due to the rapidly growing demand for cloud resources, it is difficult to estimate users' demand. As a result, the complexity of resource provisioning increases, which leads to the requirement for adaptive resource provisioning. In this paper, the authors address the problem of efficient resource provisioning through a Queue-based Q-learning algorithm using a reinforcement learning agent. Reinforcement learning has been proven in various domains for automatic control and resource provisioning. In the absence of a complete environment model, reinforcement learning can be used to define optimal allocation policies. The proposed Queue-based Q-learning agent analyses the CPU utilization of all active Virtual Machines (VMs) and detects the least loaded virtual machine for resource provisioning. It detects the least loaded virtual machines through the Inter-Quartile Range. Using the queue size of the virtual machines, it looks ahead by one time step to find the optimal virtual machine for provisioning.
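
The least-loaded-VM detection step can be outlined with NumPy as follows; treating utilizations below the classic Q1 - 1.5*IQR fence as "least loaded" is an assumption about the cutoff, which the abstract does not spell out.

```python
# Pick candidate VMs whose CPU utilization is unusually low by the IQR rule.
import numpy as np

def least_loaded_vms(cpu_util):
    """cpu_util: 1-D array-like of CPU utilization (%) for all active VMs."""
    cpu_util = np.asarray(cpu_util, dtype=float)
    q1, q3 = np.percentile(cpu_util, [25, 75])
    lower_fence = q1 - 1.5 * (q3 - q1)          # classic IQR lower fence
    candidates = np.where(cpu_util < max(lower_fence, q1))[0]
    # fall back to the single minimum if nothing falls below the fence
    return candidates if candidates.size else np.array([cpu_util.argmin()])
```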
39

Orgován, László, Tamás Bécsi, and Szilárd Aradi. "Autonomous Drifting Using Reinforcement Learning". Periodica Polytechnica Transportation Engineering 49, no. 3 (September 1, 2021): 292–300. http://dx.doi.org/10.3311/pptr.18581.

Abstract:
Autonomous vehicles or self-driving cars are prevalent nowadays; many vehicle manufacturers and other tech companies are trying to develop autonomous vehicles. One major goal of self-driving algorithms is to perform manoeuvres safely, even when some anomaly arises. To solve these kinds of complex issues, artificial intelligence and machine learning methods are used. One of these motion planning problems arises when the tires lose their grip on the road; an autonomous vehicle should handle this situation. Thus the paper provides an autonomous drifting algorithm using reinforcement learning. The algorithm is based on a model-free learning algorithm, Twin Delayed Deep Deterministic Policy Gradients (TD3). The model is trained on six different tracks in CARLA, a simulator developed specifically for autonomous driving systems.
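
The paper trains TD3 inside CARLA; as a stand-in that shows the same algorithm end to end, the sketch below uses the stable-baselines3 implementation of TD3 on a simple continuous-control task. The library, environment, and hyperparameters are assumptions of this note, not the authors' setup.

```python
# TD3 (twin critics, delayed policy updates) via stable-baselines3 on a
# placeholder continuous-action environment.
import gymnasium as gym
from stable_baselines3 import TD3

env = gym.make("Pendulum-v1")          # stand-in for the CARLA drifting tracks
model = TD3("MlpPolicy", env, learning_rate=1e-3, verbose=1)
model.learn(total_timesteps=50_000)    # twin critics and delayed updates are internal
model.save("td3_drifting_stand_in")
```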
40

Mondal, Shanka Subhra, Nikhil Sheoran, and Subrata Mitra. "Scheduling of Time-Varying Workloads Using Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9000–9008. http://dx.doi.org/10.1609/aaai.v35i10.17088.

Abstract:
Resource usage of production workloads running on shared compute clusters often fluctuates significantly across time. While a simultaneous spike in resource usage between two workloads running on the same machine can create performance degradation, unused resources in a machine result in wastage and undesirable operational characteristics for a compute cluster. Prior works did not consider such temporal resource fluctuations or their alignment when making scheduling decisions. Due to the variety of time-varying workloads and their complex resource usage characteristics, it is challenging to design well-defined heuristics for scheduling them optimally across different machines in a cluster. In this paper, we propose a Deep Reinforcement Learning (DRL) based approach to exploit various temporal resource usage patterns of time-varying workloads, as well as a technique for creating equivalence classes among a large number of production workloads to improve the scalability of our method. Validations with real production traces from Google and Alibaba show that our technique can significantly improve metrics for operational excellence (e.g., utilization, fragmentation, resource exhaustion) for a cluster, compared to the baselines.
41

Lürig, Christoph. "Learning Machine Learning with a Game". European Conference on Games Based Learning 16, no. 1 (September 29, 2022): 316–23. http://dx.doi.org/10.34190/ecgbl.16.1.481.

Abstract:
AIs playing strategic games have always fascinated humans. Specifically, the reinforcement learning technique AlphaZero (D. Silver, 2016) has gained much attention for its capability to play Go, a problem that was hard for AI to crack for a long time. Additionally, we see the rise of explainable AI (xAI), which tries to address the problem that many modern AI decision techniques are black-box approaches and incomprehensible to humans. Combining a board game AI for the relatively simple game Connect-Four with explanation techniques offers the possibility of learning something about an AI's inner workings and the game itself. This paper explains how to combine an AlphaZero-based AI with known explanation techniques used in supervised learning. Additionally, we combine this with known visualization approaches for trees. AlphaZero combines a neural network and a Monte-Carlo search tree. The approach we present in this paper focuses on two explanations. The first explanation is a dynamic analysis of the evolving situation, primarily based on the tree aspect, and works with a radial tree representation (Yee et al., 2001). The second explanation is a static analysis that tries to identify the relevant situation elements using the Lime (Local Interpretable Model-Agnostic Explanations) approach (Christoforos Anagnostopoulos, 2020). This technique focuses primarily on the neural network aspect. The straightforward application of Lime to the Monte-Carlo search tree approach would be too compute-intensive for interactive applications. We suggest a modification to accommodate search trees, specifically sacrificing the model agnosticism. We use a weighted Lasso-based approach on the different board constellations analyzed in the search tree by the neural network to obtain a final static explanation of the situation. Finally, we visually interpret the resulting linear weights from the Lasso analysis on the game board. The implementation is done in Python, using the PyGame library for visualization and interaction. We implemented the neural networks with PyTorch and the Lasso analysis with scikit-learn. This paper provides implementation details on an experimental approach to learning something about a game and how machines learn to play a game.
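
The weighted-Lasso explanation step can be outlined with scikit-learn as follows; the 6x7 board encoding and the use of tree visit counts as sample weights are assumptions consistent with, but not quoted from, the abstract.

```python
# Fit a sparse linear surrogate over positions analyzed in the search tree.
import numpy as np
from sklearn.linear_model import Lasso

def explain_positions(boards, values, visit_counts, alpha=0.01):
    """boards: [N, 42] flattened 6x7 Connect-Four encodings (assumed); values:
    network evaluations; visit_counts: tree visit counts used as sample weights."""
    X = np.asarray(boards, dtype=float)
    y = np.asarray(values, dtype=float)
    w = np.asarray(visit_counts, dtype=float)
    surrogate = Lasso(alpha=alpha)
    surrogate.fit(X, y, sample_weight=w)   # weighted sparse linear fit
    return surrogate.coef_.reshape(6, 7)   # per-cell relevance to show on the board
```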
42

Shaveta. "A review on machine learning". International Journal of Science and Research Archive 9, no. 1 (May 30, 2023): 281–85. http://dx.doi.org/10.30574/ijsra.2023.9.1.0410.

Abstract:
Machine learning is a particular branch of artificial intelligence that teaches a machine how to learn, whereas artificial intelligence (AI) is the general science that aims to emulate human abilities. An AI method called machine learning teaches computers to learn from their past experiences. Machine learning algorithms don't rely on a predetermined equation as a model, but instead "learn" information directly from data using computational techniques. As the quantity of learning examples increases, the algorithms adaptively get better at what they do. This paper provides an overview of the field as well as of a variety of machine learning approaches, including supervised, unsupervised, and reinforcement learning, and of the various languages used for machine learning.
43

Wang, Canjun, Zhao Li, Tong Chen, Ruishuang Wang, and Zhengyu Ju. "Research on the Application of Prompt Learning Pretrained Language Model in Machine Translation Task with Reinforcement Learning". Electronics 12, no. 16 (August 9, 2023): 3391. http://dx.doi.org/10.3390/electronics12163391.

Abstract:
With the continuous advancement of deep learning technology, pretrained language models have emerged as crucial tools for natural language processing tasks. However, optimization of pretrained language models is essential for specific tasks such as machine translation. This paper presents a novel approach that integrates reinforcement learning with prompt learning to enhance the performance of pretrained language models in machine translation tasks. In our methodology, a “prompt” string is incorporated into the input of the pretrained language model, to guide the generation of an output that aligns closely with the target translation. Reinforcement learning is employed to train the model in producing optimal translation results. During this training process, the target translation is utilized as a reward signal to incentivize the model to generate an output that aligns more closely with the desired translation. Experimental results validated the effectiveness of the proposed approach. The pretrained language model trained with prompt learning and reinforcement learning exhibited superior performance compared to traditional pretrained language models in machine translation tasks. Furthermore, we observed that different prompt strategies significantly impacted the model’s performance, underscoring the importance of selecting an optimal prompt strategy tailored to the specific task. The results suggest that using techniques such as prompt learning and reinforcement learning can improve the performance of pretrained language models for tasks such as text generation and machine translation. The method proposed in this paper not only offers a fresh perspective on leveraging pretrained language models in machine translation and other related tasks but also serves as a valuable reference for further research in this domain. By combining reinforcement learning with prompt learning, researchers can explore new avenues for optimizing pretrained language models and improving their efficacy in various natural language processing tasks.
44

Belozerov, Ilya Andreevich, and Vladimir Anatolievich Sudakov. "Reinforcement Machine Learning for Solving Mathematical Programming Problems". Keldysh Institute Preprints, no. 36 (2022): 1–14. http://dx.doi.org/10.20948/prepr-2022-36.

Abstract:
This paper discusses modern approaches to finding rational solutions to problems of mixed-integer linear programming, both generated with random data and taken from real practice. The main emphasis is on how to implement the process of finding a solution to discrete optimization problems using the concept of reinforcement learning, and on what techniques can be applied to improve the speed and quality of the search. Three main variants of the algorithm were developed using the Ray library API, with the environment built on the Gym library. The results of the developed solver are compared with the OR-Tools library. The best model can be used as a solver for high-dimensional optimization problems; in addition, this concept is applicable to other combinatorial problems with a change in the environment code and the intelligent agent algorithm.
45

Cappart, Quentin, Emmanuel Goutierre, David Bergman, and Louis-Martin Rousseau. "Improving Optimization Bounds Using Machine Learning: Decision Diagrams Meet Deep Reinforcement Learning". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1443–51. http://dx.doi.org/10.1609/aaai.v33i01.33011443.

Abstract:
Finding tight bounds on the optimal solution is a critical element of practical solution methods for discrete optimization problems. In the last decade, decision diagrams (DDs) have brought a new perspective on obtaining upper and lower bounds that can be significantly better than classical bounding mechanisms, such as linear relaxations. It is well known that the quality of the bounds achieved through this flexible bounding method is highly reliant on the ordering of variables chosen for building the diagram, and finding an ordering that optimizes standard metrics is an NP-hard problem. In this paper, we propose an innovative and generic approach based on deep reinforcement learning for obtaining an ordering for tightening the bounds obtained with relaxed and restricted DDs. We apply the approach to both the Maximum Independent Set Problem and the Maximum Cut Problem. Experimental results on synthetic instances show that the deep reinforcement learning approach, by achieving tighter objective function bounds, generally outperforms ordering methods commonly used in the literature when the distribution of instances is known. To the best knowledge of the authors, this is the first paper to apply machine learning to directly improve relaxation bounds obtained by general-purpose bounding mechanisms for combinatorial optimization problems.
46

R. Merina, Queentin. "Use of reinforcement learning algorithms to optimize control strategies for single machine systems". i-manager's Journal on Instrumentation and Control Engineering 10, no. 2 (2022): 36. http://dx.doi.org/10.26634/jic.10.2.19357.

Abstract:
The stability of power systems is critical to the reliable and efficient operation of electrical grids. In recent years, there has been growing interest in using artificial intelligence techniques, such as reinforcement learning, to improve the stability of single machine systems. Reinforcement learning is a machine learning approach that enables agents to learn optimal control policies through trial and error. In this paper, we explore the use of reinforcement learning algorithms to optimize control strategies for single machine systems and demonstrate how these algorithms can identify the best control actions to take in real time to prevent system instability. The challenges and limitations of using reinforcement learning in power system applications are discussed, and recommendations are provided for future research in this area. Our results show that reinforcement learning has great potential for improving the stability of single machine systems and can be a valuable tool for power system operators and engineers.
APA, Harvard, Vancouver, ISO and other styles
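A hedged illustration of the idea behind this entry: tabular Q-learning drives a one-dimensional "deviation" state back toward its nominal value. The dynamics below are a caricature invented for the sketch, not a model of the single machine system the paper targets.

```python
import random

N_STATES = 11                      # discretized deviation levels, 5 = nominal
ACTIONS = (-1, 0, 1)               # control nudges

def plant(s, a):
    # Caricature dynamics: the control nudge plus a random disturbance.
    s2 = max(0, min(N_STATES - 1, s + a + random.choice((-1, 0, 1))))
    return s2, -abs(s2 - 5)        # reward penalizes deviation from nominal

Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
alpha, gamma, eps = 0.1, 0.95, 0.1
s = 5
for t in range(20000):
    if random.random() < eps:                                   # explore
        a = random.randrange(len(ACTIONS))
    else:                                                       # exploit
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
    s2, r = plant(s, ACTIONS[a])
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])       # Q-learning
    s = s2

policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: Q[st][i])]
          for st in range(N_STATES)]
print("greedy control action per state:", policy)
```

The learned policy pushes low states up and high states down, i.e., a stabilizing controller discovered purely from the reward signal.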
47

Elizarov, Artem Aleksandrovich, and Evgenii Viktorovich Razinkov. "Image Classification Using Reinforcement Learning". Russian Digital Libraries Journal 23, no. 6 (12 May 2020): 1172–91. http://dx.doi.org/10.26907/1562-5419-2020-23-6-1172-1191.

Full text
Abstract (summary):
Recently, reinforcement learning has been an actively developing area of machine learning. As a consequence, attempts are being made to apply reinforcement learning to computer vision problems, in particular to image classification, which is currently among the most pressing tasks in artificial intelligence. The article proposes an image classification method in the form of a deep neural network trained with reinforcement learning. The idea of the method is to cast classification as a contextual multi-armed bandit problem, using various strategies for balancing exploration and exploitation together with reinforcement learning algorithms. Strategies such as ε-greedy, ε-softmax, ε-decay-softmax, and the UCB1 method are considered, along with reinforcement learning algorithms such as DQN, REINFORCE, and A2C. The influence of various parameters on the efficiency of the method is analyzed, and options for further development of the method are proposed.
APA, Harvard, Vancouver, ISO and other styles
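The contextual-bandit formulation this abstract uses can be shown in miniature: contexts are (discretized) features, arms are class labels, and the learner only observes a 0/1 reward for the label it chose, never the true label. The sketch below uses UCB1 for the exploration-exploitation balance; ε-greedy would replace the argmax with an occasional random arm. The data model is invented.

```python
import math
import random

def draw_example():
    # Invented data model: a 1-D feature whose mean depends on the class.
    label = random.randint(0, 1)
    return random.gauss(float(label), 0.5), label

def context(x):
    return 0 if x < 0.5 else 1     # crude discretization of the feature

counts = [[1, 1], [1, 1]]          # counts[context][arm] (init 1 avoids /0)
wins = [[0, 0], [0, 0]]            # accumulated reward per context and arm
correct = 0
T = 20000
for t in range(1, T + 1):
    x, label = draw_example()
    c = context(x)
    # UCB1: empirical mean reward plus an exploration bonus per context.
    arm = max((0, 1), key=lambda a: wins[c][a] / counts[c][a]
              + math.sqrt(2.0 * math.log(t) / counts[c][a]))
    r = int(arm == label)          # only the chosen arm's reward is revealed
    counts[c][arm] += 1
    wins[c][arm] += r
    correct += r
print("bandit accuracy:", correct / T)
```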
48

Jiang, Weihang. "Applications of machine learning in neuroscience and inspiration of reinforcement learning for computational neuroscience". Applied and Computational Engineering 4, no. 1 (14 June 2023): 473–78. http://dx.doi.org/10.54254/2755-2721/4/2023308.

Full text
Abstract (summary):
High-performance machine learning algorithms have long been a central concern for researchers. Since its birth, machine learning has been a product of multidisciplinary integration. In the field of neuroscience in particular, models from related fields continue to inspire the development of neural networks and to deepen our understanding of them. The mathematical, quantitative modeling approach brought about by machine learning is in turn feeding back into the development of neuroscience; one of the emerging products of this exchange is computational neuroscience. Computational neuroscience has been pushing the boundaries of models of brain function in recent years, and just as early studies of the visual hierarchy influenced neural networks, it has great potential to inspire higher-performance machine learning algorithms, particularly deep learning algorithms with strong links to neuroscience. This paper first reviews the contributions of machine learning to neuroscience in recent years, especially in fMRI image recognition, and then examines possibilities for the future development of neural networks arising from recent work in computational psychiatry on temporal-difference models of dopamine and serotonin.
APA, Harvard, Vancouver, ISO and other styles
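The temporal-difference model of dopamine mentioned at the end of this abstract has a compact core: the TD(0) prediction error δ = r + γV(s') − V(s) is large when a reward is surprising and vanishes once it is fully predicted, mirroring phasic dopamine responses. A minimal conditioning-trial sketch (trial structure and parameters invented for illustration):

```python
# TD(0) on a repeated conditioning trial: a cue at step 0 reliably predicts a
# reward at the final step. delta is the reward prediction error that the
# temporal-difference account proposes phasic dopamine encodes.
T, alpha, gamma = 6, 0.1, 1.0
V = [0.0] * (T + 1)                # value of each within-trial time step
history = []
for trial in range(200):
    deltas = []
    for t in range(T):
        r = 1.0 if t == T - 1 else 0.0
        delta = r + gamma * V[t + 1] - V[t]   # reward prediction error
        V[t] += alpha * delta
        deltas.append(round(delta, 2))
    history.append(deltas)
print("trial 1 errors:  ", history[0])     # surprise at reward delivery
print("trial 200 errors:", history[-1])    # reward predicted, errors near zero
```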
49

Kong, Xiang, Zhaopeng Tu, Shuming Shi, Eduard Hovy, and Tong Zhang. "Neural Machine Translation with Adequacy-Oriented Learning". Proceedings of the AAAI Conference on Artificial Intelligence 33 (17 July 2019): 6618–25. http://dx.doi.org/10.1609/aaai.v33i01.33016618.

Full text
Abstract (summary):
Although Neural Machine Translation (NMT) models have advanced the state of the art in machine translation, they face problems such as inadequate translation. We attribute this to the fact that standard Maximum Likelihood Estimation (MLE) cannot judge real translation quality, owing to several of its limitations. In this work, we propose an adequacy-oriented learning mechanism for NMT by casting translation as a stochastic policy in Reinforcement Learning (RL), where the reward is estimated by explicitly measuring translation adequacy. Benefiting from sequence-level training under the RL strategy and a more accurate reward designed specifically for translation, our model outperforms multiple strong baselines, including (1) standard and coverage-augmented attention models with MLE-based training, and (2) advanced reinforcement and adversarial training strategies with rewards based on both word-level BLEU and character-level CHRF3. Quantitative and qualitative analyses on different language pairs and NMT architectures demonstrate the effectiveness and universality of the proposed approach.
APA, Harvard, Vancouver, ISO and other styles
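The sequence-level training this abstract describes — weight the log-probability of a whole sampled translation by a sentence-level reward — reduces to a REINFORCE loss. The sketch below uses a tiny unconditional categorical "decoder" and a token-overlap reward as stand-ins for an NMT model and the paper's adequacy metric; it is a minimal illustration, not the authors' implementation.

```python
import torch

vocab_size, length = 5, 4
logits = torch.zeros(length, vocab_size, requires_grad=True)  # toy "decoder"
reference = torch.tensor([1, 2, 3, 4])
opt = torch.optim.SGD([logits], lr=0.5)

def adequacy(sample, ref):
    # Toy sentence-level reward: fraction of reference tokens produced in
    # place (stands in for adequacy or BLEU/CHRF3-style rewards).
    return (sample == ref).float().mean().item()

for step in range(500):
    dist = torch.distributions.Categorical(logits=logits)
    sample = dist.sample()                          # one sampled "translation"
    r = adequacy(sample, reference)
    loss = -r * dist.log_prob(sample).sum()         # sequence-level REINFORCE
    opt.zero_grad()
    loss.backward()
    opt.step()

print("greedy output:", logits.argmax(dim=-1).tolist(),
      "reference:", reference.tolist())
```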
50

Durgut, Rafet, Mehmet Emin Aydin, and Abdur Rakib. "Transfer Learning for Operator Selection: A Reinforcement Learning Approach". Algorithms 15, no. 1 (17 January 2022): 24. http://dx.doi.org/10.3390/a15010024.

Full text
Abstract (summary):
In the past two decades, metaheuristic optimisation algorithms (MOAs) have become increasingly popular, particularly for logistics, science, and engineering problems. A fundamental characteristic of such algorithms is that they depend on parameters or strategies, and various online and offline strategies are employed to obtain optimal configurations. Adaptive operator selection is one of these: it decides which operator to apply from the operator pool at each point in the search process. In machine learning, Reinforcement Learning (RL) refers to goal-oriented algorithms that learn from the environment how to achieve a goal, and RL has been used with MOAs to control the operator selection process. However, existing research does not show whether learned information can be transferred from one problem-solving procedure to another. The primary goal of the proposed research is to determine the impact of transfer learning on RL and MOAs. A set union knapsack problem with 30 separate benchmark instances is used as the test problem, and the results are statistically compared in depth. According to the findings, the transferred learning improved convergence speed while significantly reducing CPU time.
APA, Harvard, Vancouver, ISO and other styles
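The mechanism this abstract studies can be sketched in a few lines: keep a running value estimate per search operator, select operators ε-greedily by those values, credit each application by the improvement it produces, and "transfer" by warm-starting the value table on the next instance. The operator names and their hidden success rates below are invented for illustration.

```python
import random

OPERATORS = ["flip_one", "swap_pair", "greedy_repair"]   # illustrative names

def apply_operator(op, solution):
    # Stub: each operator has a hidden success rate; a real implementation
    # would perturb a knapsack solution here.
    gain = {"flip_one": 0.3, "swap_pair": 0.5, "greedy_repair": 0.7}[op]
    return solution + (1.0 if random.random() < gain else -0.2)

def search(Q, steps=2000, eps=0.1, alpha=0.1):
    solution = 0.0
    for _ in range(steps):
        if random.random() < eps:                 # explore an operator
            op = random.choice(OPERATORS)
        else:                                     # exploit the best-valued one
            op = max(OPERATORS, key=Q.get)
        candidate = apply_operator(op, solution)
        improvement = max(0.0, candidate - solution)   # credit = improvement
        Q[op] += alpha * (improvement - Q[op])
        if candidate > solution:
            solution = candidate
    return solution, Q

score1, learned = search({op: 0.0 for op in OPERATORS})  # instance 1: cold start
score2, _ = search(dict(learned))                        # instance 2: transfer
print("cold-start score:", round(score1, 1),
      "| warm-start score:", round(score2, 1))
```

The warm-started run skips the initial period of crediting weak operators, which is the convergence-speed effect the abstract reports.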