Journal articles on the topic 'Dynamic optimal learning rate'

To see the other types of publications on this topic, follow the link: Dynamic optimal learning rate.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 50 journal articles for your research on the topic 'Dynamic optimal learning rate.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Chinrungrueng, C., and C. H. Sequin. "Optimal adaptive k-means algorithm with dynamic adjustment of learning rate." IEEE Transactions on Neural Networks 6, no. 1 (1995): 157–69. http://dx.doi.org/10.1109/72.363440.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Zhu, Yingqiu, Danyang Huang, Yuan Gao, Rui Wu, Yu Chen, Bo Zhang, and Hansheng Wang. "Automatic, dynamic, and nearly optimal learning rate specification via local quadratic approximation." Neural Networks 141 (September 2021): 11–29. http://dx.doi.org/10.1016/j.neunet.2021.03.025.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Leen, Todd K., Bernhard Schottky, and David Saad. "Optimal asymptotic learning rate: Macroscopic versus microscopic dynamics." Physical Review E 59, no. 1 (January 1, 1999): 985–91. http://dx.doi.org/10.1103/physreve.59.985.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Kalvit, Anand, and Assaf Zeevi. "Dynamic Learning in Large Matching Markets." ACM SIGMETRICS Performance Evaluation Review 50, no. 2 (August 30, 2022): 18–20. http://dx.doi.org/10.1145/3561074.3561081.

Full text
Abstract:
We study a sequential matching problem faced by large centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete times (possibly in batches of stochastic size and composition) with "job-types" observable upon arrival. To capture the "choice overload" phenomenon, we posit an unlimited supply of workers where each worker is characterized by a vector of attributes (aka "worker-types") sampled from an underlying population-level distribution. The distribution as well as mean payoffs for possible worker-job type-pairs are unobservable, and the platform's goal is to sequentially match incoming jobs to workers in a way that maximizes its cumulative payoffs over the planning horizon. We establish lower bounds on the regret of any matching algorithm in this setting and propose a novel rate-optimal learning algorithm that adapts to the aforementioned primitives online. Our learning guarantees highlight a distinctive characteristic of the problem: achievable performance only has a second-order dependence on worker-type distributions; we believe this finding may be of interest more broadly.
APA, Harvard, Vancouver, ISO, and other styles
5

Zheng, Jiangbo, Yanhong Gan, Ying Liang, Qingqing Jiang, and Jiatai Chang. "Joint Strategy of Dynamic Ordering and Pricing for Competing Perishables with Q-Learning Algorithm." Wireless Communications and Mobile Computing 2021 (March 13, 2021): 1–19. http://dx.doi.org/10.1155/2021/6643195.

Full text
Abstract:
We use Machine Learning (ML) to study firms’ joint pricing and ordering decisions for perishables in a dynamic loop. The research assumption is as follows: at the beginning of each period, the retailer prices both the new and old products and determines how many new products to order, while at the end of each period, the retailer decides how much remaining inventory should be carried over to the next period. The objective is to determine a joint pricing, ordering, and disposal strategy to maximize the total expected discounted profit. We establish a decision model based on Markov processes and use the Q-learning algorithm to obtain a near-optimal policy. From numerical analysis, we find that (i) the optimal number of old products carried over to the next period depends on the upper quantitative bound for old inventory; (ii) the optimal prices for new products are positively related to potential demand but negatively related to the decay rate, while the optimal prices for old products have a positive relationship with both; and (iii) ordering decisions are unrelated to the quantity of old products. When the decay rate is low or the variable ordering cost is high, the optimal orders exhibit a trapezoidal decline as the quantity of new products increases.
APA, Harvard, Vancouver, ISO, and other styles
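The abstract above reports obtaining a near-optimal joint pricing and ordering policy with Q-learning, but gives no implementation detail. As a hedged sketch of the tabular Q-learning update such a model relies on (the function name, state/action encodings, and parameter values here are illustrative, not the paper's), one step looks like this:

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.95):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a)).
    q is a dict mapping (state, action) pairs to value estimates."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

Iterating this update over simulated (price, order) decisions and observed profits converges, under standard conditions, toward the kind of near-optimal policy the authors report.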
6

Chen, Zhigang, Rongwei Xu, and Yongxi Yi. "Dynamic Optimal Control of Transboundary Pollution Abatement under Learning-by-Doing Depreciation." Complexity 2020 (June 9, 2020): 1–17. http://dx.doi.org/10.1155/2020/3763684.

Full text
Abstract:
This paper analyzes a dynamic Stackelberg differential game model of watershed transboundary water pollution abatement and discusses the optimal decision-making problem under non-cooperative and cooperative differential games, in which the accumulation and depreciation effects of learning-by-doing pollution abatement investment are taken into account. We use dynamic optimization theory to solve for the equilibrium solutions of the models. Through numerical simulation, we analyze the optimal trajectory curves of each variable under a finite planning horizon and in the long-term steady state. Under the finite planning horizon, the longer the planning period, the lower the optimal emission rate in equilibrium. The long-term steady-state game under cooperative decision making can effectively reduce the amount of pollution emitted. The investment intensity of pollution abatement in the non-cooperative game is higher than in the cooperative game. In the long-term steady state, the pollution abatement investment trajectory of the cooperative game is relatively stable and shows no obvious crowding-out effect. Investment continues to rise, and the optimal equilibrium level at steady state is higher than under non-cooperative decision making. The decline in pollution stock under the finite planning horizon is not significant. In the long-term steady state, the trajectories of upstream and downstream pollution in the non-cooperative and cooperative models are similar, but the cooperative decision-making model is superior in terms of the period of stabilization and the steady state reached.
APA, Harvard, Vancouver, ISO, and other styles
7

De, Shipra, and Darryl A. Seale. "Dynamic Decision Making and Race Games." ISRN Operations Research 2013 (August 7, 2013): 1–15. http://dx.doi.org/10.1155/2013/452162.

Full text
Abstract:
Frequent criticism of dynamic decision making research pertains to the overly complex nature of the decision tasks used in experimentation. To address such concerns, we study dynamic decision making with respect to a simple race game, which has a computable optimal strategy. In this two-player race game, individuals compete to be the first to reach a designated threshold of points. Players alternate rolling a desired quantity of dice. If the number one appears on any of the dice, the player receives no points for his turn; otherwise, the sum of the numbers appearing on the dice is added to the player's score. Results indicate that although players are influenced by the game state when making their decisions, they tend to play too conservatively in comparison to the optimal policy and are influenced by the behavior of their opponents. Improvement in performance was negligible with repeated play. Survey data suggests that this outcome could be due to inadequate time for learning or insufficient player motivation. However, some players approached optimal heuristic strategies, which perform remarkably well.
APA, Harvard, Vancouver, ISO, and other styles
8

Yao, Yuhang, and Carlee Joe-Wong. "Interpretable Clustering on Dynamic Graphs with Recurrent Graph Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 5 (May 18, 2021): 4608–16. http://dx.doi.org/10.1609/aaai.v35i5.16590.

Full text
Abstract:
We study the problem of clustering nodes in a dynamic graph, where the connections between nodes and nodes' cluster memberships may change over time, e.g., due to community migration. We first propose a dynamic stochastic block model that captures these changes, and a simple decay-based clustering algorithm that clusters nodes based on weighted connections between them, where the weight decreases at a fixed rate over time. This decay rate can then be interpreted as signifying the importance of including historical connection information in the clustering. However, the optimal decay rate may differ for clusters with different rates of turnover. We characterize the optimal decay rate for each cluster and propose a clustering method that achieves almost exact recovery of the true clusters. We then demonstrate the efficacy of our clustering algorithm with optimized decay rates on simulated graph data. Recurrent neural networks (RNNs), a popular algorithm for sequence learning, use a similar decay-based method, and we use this insight to propose two new RNN-GCN (graph convolutional network) architectures for semi-supervised graph clustering. We finally demonstrate that the proposed architectures perform well on real data compared to state-of-the-art graph clustering algorithms.
APA, Harvard, Vancouver, ISO, and other styles
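The decay-based edge weighting described in the abstract above can be sketched concretely. The exponential form and function name below are illustrative assumptions (the abstract only specifies that weights decrease at a fixed rate over time):

```python
import math

def decayed_weight(interaction_times, now, decay_rate):
    """Aggregate edge weight between two nodes: each past interaction
    at time t contributes exp(-decay_rate * (now - t)), so older
    connections count for less; a larger decay_rate discounts
    history faster."""
    return sum(math.exp(-decay_rate * (now - t)) for t in interaction_times)
```

Clusters with high membership turnover would call for a larger `decay_rate`, matching the paper's observation that the optimal decay rate is cluster-specific.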
9

Liu, Haijun. "A Study of an IT-Assisted Higher Education Model Based on Distributed Hardware-Assisted Tracking Intervention." Occupational Therapy International 2022 (April 8, 2022): 1–12. http://dx.doi.org/10.1155/2022/8862716.

Full text
Abstract:
This paper presents an in-depth study and analysis of a higher education model based on distributed hardware-assisted tracking intervention in information technology. An MEC-based dynamic adaptive video stream caching model is proposed. The model dynamically adjusts the bit rate by referring to broadband estimation and cache occupancy data to ensure users have a smooth viewing experience. Simulation results show that the model requires fewer transcoding operations and generates lower latency than the traditional model, which is suitable for dual-teacher classroom scenarios and further improves the quality of the user's video viewing experience. The model uses an edge cloud collaborative architecture to migrate rendering to an edge server closer to the user side, enabling real-time interaction, computation, and rendering and reducing both data transmission and computation time. Based on the blended-learning adaptive intervention model, three rounds of teaching practice are conducted to validate the effectiveness of the intervention model in terms of both student process performance and outcome performance, thereby improving learning adaptability and learning outcomes. Teachers' teaching has a significant impact on learning motivation (β = 0.311, p < 0.01), which in turn affects learning adaptability. Teachers use scientific teaching methods to stimulate students' learning motivation, mobilize enthusiasm, and improve learning adaptability. With the system's communication topology modeled as a directed graph, a multi-agent system dynamic model with grouping is established; i.e., agents within a group share the same dynamics, which differ between groups, and all system dynamics are unknown. The proposed novel policy iteration algorithm is used to learn the optimal control protocol and achieve optimal consensus control. The effectiveness of the algorithm is demonstrated by the simulation experimental results.
The simulation results show that the model has lower latency and energy consumption compared to the cloud rendering model, which is suitable for the safety education classroom scenario and solves the outstanding problems of network connection rate and cloud service latency.
APA, Harvard, Vancouver, ISO, and other styles
10

Li, Ao, Zhaoman Wan, and Zhong Wan. "Optimal Design of Online Sequential Buy-Price Auctions with Consumer Valuation Learning." Asia-Pacific Journal of Operational Research 37, no. 03 (April 29, 2020): 2050012. http://dx.doi.org/10.1142/s0217595920500128.

Full text
Abstract:
Buy-price auction has been successfully used as a new channel of online sales. This paper studies an online sequential buy-price auction problem, where a seller has an inventory of identical products and needs to clear them through a sequence of online buy-price auctions such that the total profit is maximized by optimizing the buy price in each auction. We propose a methodology by dynamic programming approach to solve this optimization problem. Since the consumers’ behavior affects the seller’s revenue, the consumers’ strategy used in this auction is first investigated. Then, two different dynamic programming models are developed to optimize the seller’s decision-making: one is the clairvoyant model corresponding to a situation where the seller has complete information about consumer valuations, and the other is the Bayesian learning model where the seller makes optimal decisions by continuously recording and utilizing auction data during the sales process. Numerical experiments are employed to demonstrate the impacts of several key factors on the optimal solutions, including the size of inventory, the number of potential consumers, and the rate at which the seller discounts early incomes. It is shown that when the consumers’ valuations are uniformly distributed, the Bayesian learning model is of great efficiency if the demand is adequate.
APA, Harvard, Vancouver, ISO, and other styles
11

Wang, Xing-Ju, Xiao-Ming Xi, and Gui-Feng Gao. "Reinforcement Learning Ramp Metering without Complete Information." Journal of Control Science and Engineering 2012 (2012): 1–8. http://dx.doi.org/10.1155/2012/208456.

Full text
Abstract:
This paper develops a model of reinforcement learning ramp metering (RLRM) without complete information, which is applied to alleviate traffic congestion on ramps. RLRM consists of prediction tools based on traffic flow simulation and an optimal choice model based on reinforcement learning theory. Moreover, it is a dynamic process with automaticity, memory, and performance feedback. Numerical cases are given in this study to demonstrate RLRM, calculating outflow rate, density, average speed, and travel time compared to no control and fixed-time control. Results indicate that the greater the inflow, the greater the effect. In addition, the stability of RLRM is better than that of fixed-time control.
APA, Harvard, Vancouver, ISO, and other styles
12

Shi, Yuanji, Zhiwei Yuan, Xiaorong Zhu, and Hongbo Zhu. "An Adaptive Routing Algorithm for Inter-Satellite Networks Based on the Combination of Multipath Transmission and Q-Learning." Processes 11, no. 1 (January 5, 2023): 167. http://dx.doi.org/10.3390/pr11010167.

Full text
Abstract:
In a satellite network, inter-satellite links facilitate information transmission and exchange between satellites, and packet routing over inter-satellite links is a key development direction for satellite communication systems. Given the complex and dynamically changing topology of LEO satellite networks, the traditional single shortest-path algorithm can no longer meet the optimal-path requirement. Therefore, this paper proposes a multi-path routing algorithm based on an improved breadth-first search. First, according to the inter-satellite network topology information, the improved breadth-first search algorithm is used to obtain all the predecessor-node information of the destination node. Second, all the shortest paths are obtained by backtracking through the predecessor nodes. Finally, the optimal path is determined from the multiple shortest paths according to the traffic and the bandwidth capacity of the nodes. However, due to the high dynamics of low-orbit satellite networks, the topology changes rapidly, and the global topology of the network is often not available. To enhance the adaptability of the algorithm in this case, this paper also proposes an inter-satellite network dynamic routing algorithm based on reinforcement learning. Simulation experiments verify that the proposed algorithm can improve the throughput of the inter-satellite network and reduce both the time delay and the packet loss rate.
APA, Harvard, Vancouver, ISO, and other styles
13

Yuan, Minghai, Chenxi Zhang, Kaiwen Zhou, and Fengque Pei. "Real-time Allocation of Shared Parking Spaces Based on Deep Reinforcement Learning." 網際網路技術學刊 (Journal of Internet Technology) 24, no. 1 (January 2023): 035–43. http://dx.doi.org/10.53106/160792642023012401004.

Full text
Abstract:
Aiming at the parking space heterogeneity problem in shared parking space management, a multi-objective optimization model for parking space allocation is constructed with the optimization objectives of reducing the average walking distance of users and improving the utilization rate of parking spaces. A real-time allocation method for shared parking spaces based on deep reinforcement learning is proposed, which includes a state space for heterogeneous regions, an action space based on policy selection, and a reward function with variable coefficients. To accurately evaluate the model performance, dynamic programming is used to derive the theoretical optimal values. Simulation results show that the improved algorithm not only improves the training success rate but also increases agent performance by at least 12.63% and maintains the advantage for different sizes of parking demand, reducing user walking distance by 53.58%, improving parking utilization by 6.67% on average, and keeping the response time under 0.2 seconds.
APA, Harvard, Vancouver, ISO, and other styles
14

Jepma, Marieke, Stephen B. R. E. Brown, Peter R. Murphy, Stephany C. Koelewijn, Boukje de Vries, Arn M. van den Maagdenberg, and Sander Nieuwenhuis. "Noradrenergic and Cholinergic Modulation of Belief Updating." Journal of Cognitive Neuroscience 30, no. 12 (December 2018): 1803–20. http://dx.doi.org/10.1162/jocn_a_01317.

Full text
Abstract:
To make optimal predictions in a dynamic environment, the impact of new observations on existing beliefs—that is, the learning rate—should be guided by ongoing estimates of change and uncertainty. Theoretical work has proposed specific computational roles for various neuromodulatory systems in the control of learning rate, but empirical evidence is still sparse. The aim of the current research was to examine the role of the noradrenergic and cholinergic systems in learning rate regulation. First, we replicated our recent findings that the centroparietal P3 component of the EEG—an index of phasic catecholamine release in the cortex—predicts trial-to-trial variability in learning rate and mediates the effects of surprise and belief uncertainty on learning rate (Study 1, n = 17). Second, we found that pharmacological suppression of either norepinephrine or acetylcholine activity produced baseline-dependent effects on learning rate following nonobvious changes in an outcome-generating process (Study 1). Third, we identified two genes, coding for α2A receptor sensitivity (ADRA2A) and norepinephrine reuptake (NET), as promising targets for future research on the genetic basis of individual differences in learning rate (Study 2, n = 137). Our findings suggest a role for the noradrenergic and cholinergic systems in belief updating and underline the importance of studying interactions between different neuromodulatory systems.
APA, Harvard, Vancouver, ISO, and other styles
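The "learning rate" in the abstract above is the gain of a delta-rule belief update. As a hedged sketch (the study's actual computational models of change and uncertainty are more elaborate), the basic form being modulated is:

```python
def update_belief(belief, outcome, learning_rate):
    """Delta-rule update: move the current estimate toward the new
    observation by a fraction given by the learning rate, which
    (per the abstract) should track ongoing change and uncertainty."""
    prediction_error = outcome - belief
    return belief + learning_rate * prediction_error
```

A learning rate near 1 makes beliefs track the latest outcome (appropriate after a change point); a rate near 0 averages over history (appropriate in a stable environment).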
15

Xiang, Yao, Jingling Yuan, Ruiqi Luo, Xian Zhong, and Tao Li. "An Energy Dynamic Control Algorithm Based on Reinforcement Learning for Data Centers." International Journal of Pattern Recognition and Artificial Intelligence 33, no. 13 (December 15, 2019): 1951009. http://dx.doi.org/10.1142/s0218001419510091.

Full text
Abstract:
In recent years, how to use renewable energy to reduce the energy cost of internet data centers (IDCs) has become an urgent problem. More and more solutions consider machine learning, but many existing methods rely on future information that is difficult to obtain in actual operation. In this paper, we focus on reducing the energy cost of an IDC by controlling the energy flow of renewable energy without any future information. We propose an efficient energy dynamic control algorithm based on reinforcement learning, which approximates the optimal solution by learning from the feedback of historical control decisions. To avoid overestimation and improve the convergence of the algorithm, we use the double Q-learning method for further optimization. Extensive experimental results show that our algorithm can on average save energy cost by 18.3% and reduce the rate of grid intervention by 26.2% compared with other algorithms, and thus has good application prospects.
APA, Harvard, Vancouver, ISO, and other styles
16

Chiu, Kai-Cheng, Chien-Chang Liu, and Li-Der Chou. "Reinforcement Learning-Based Service-Oriented Dynamic Multipath Routing in SDN." Wireless Communications and Mobile Computing 2022 (January 31, 2022): 1–16. http://dx.doi.org/10.1155/2022/1330993.

Full text
Abstract:
The increasing quality and varied requirements of network services can be guaranteed thanks to the emerging software-defined networking (SDN) paradigm, which benefits from a centralized, software-defined architecture. SDN not only facilitates the configuration of network policies for traffic engineering but also makes it convenient to obtain the network state. The traffic of numerous services is transmitted within a network, and each service may demand different network metrics, such as low latency or low packet loss rate. Corresponding quality-of-service policies must be enforced to meet the requirements of different services, and balanced link utilization is also indispensable. In this research, Reinforcement Discrete Learning-Based Service-Oriented Multipath Routing (RED-STAR) has been proposed to learn the policy of assigning an optimal path to each service. RED-STAR takes the network state and service type as input values to dynamically select the path on which a service must be forwarded. Custom protocols are designed for network state obtainment, and a deep learning-based traffic classification model is integrated to identify network services. With a differentiated reward scheme for every service type, the reinforcement learning model in RED-STAR gradually achieves high reward values in various scenarios. The experimental results show that RED-STAR can adapt to the dynamic network environment, obtaining the highest average reward value of 1.8579 and the lowest average maximum bandwidth utilization of 0.3601 among all path distribution schemes in a real-case scenario.
APA, Harvard, Vancouver, ISO, and other styles
17

Ding, Fan, Yongyi Zhang, Rui Chen, Zhanwen Liu, and Huachun Tan. "A Deep Learning Based Traffic State Estimation Method for Mixed Traffic Flow Environment." Journal of Advanced Transportation 2022 (April 7, 2022): 1–12. http://dx.doi.org/10.1155/2022/2166345.

Full text
Abstract:
Traffic state estimation plays a fundamental role in traffic control and management. In the connected vehicles (CVs) environment, more traffic-related data perceived and interacted by CVs can be used to estimate traffic state. However, when there is a low penetration rate of CVs, the data collected from CVs would be inadequate. Meanwhile, the representativeness of the collected data is positively correlated with the penetration rate. This article presents a traffic state estimation method based on a deep learning algorithm under a low and dynamic CVs penetration rate environment. Specifically, we design a K-Nearest Neighbor (KNN) data filling model integrating acceleration data to solve the problem of insufficient data. This method can fuse the time feature of speed by acceleration modification and mine the distribution features of speed by KNN. In addition, to reduce the estimation error caused by penetration rate, we design a Long Short-Term Memory (LSTM) model, which uses penetration rate estimated by Macroscopic Fundamental Diagram (MFD) as one of the input factors. Finally, we use the concept of operational efficiency for reference, dividing traffic state into three categories according to the estimated speed: free flow, optimal flow, and congestion. SUMO is used to simulate traffic cases under different penetration rates to evaluate our scheme. The results suggest that our data filling model can significantly improve filling accuracy under a low penetration rate; there is also a better performance of our estimation model than that of other comparison models in both low and dynamic penetration rates.
APA, Harvard, Vancouver, ISO, and other styles
18

Chan, Felix T. S., Zhengxu Wang, Yashveer Singh, X. P. Wang, J. H. Ruan, and M. K. Tiwari. "Activity scheduling and resource allocation with uncertainties and learning in activities." Industrial Management & Data Systems 119, no. 6 (July 8, 2019): 1289–320. http://dx.doi.org/10.1108/imds-01-2019-0002.

Full text
Abstract:
Purpose The purpose of this paper is to develop a model which schedules activities and allocates resources in a resource-constrained project management problem, considering learning rate and uncertainties in the activity durations. Design/methodology/approach An activity schedule with requirements of different resource units is used to calculate the objectives: makespan and resource efficiency. A comparison between the non-dominated sorting genetic algorithm II (NSGA-II) and the non-dominated sorting genetic algorithm III (NSGA-III) is made to calculate near-optimal solutions. Buffers are introduced in the activity schedule to take uncertainty into account, and learning rate is used to incorporate the learning effect. Findings The results show that NSGA-III gives better near-optimal solutions than NSGA-II for the multi-objective problem across different complexities of the activity schedule. Research limitations/implications The paper does not consider activity sequencing with multiple activity relations (for instance, partial overlap among different activities) or dynamic events occurring between or during activities. Practical implications The paper helps project managers in the manufacturing industry to schedule activities and allocate resources in a near-real-world environment. Originality/value This paper takes into account both the learning rate and the uncertainties in activity durations for a resource-constrained project management problem. The uncertainty in both the individual activity durations and the whole project duration is taken into consideration. Genetic algorithms were used to solve the problem at hand.
APA, Harvard, Vancouver, ISO, and other styles
19

Starling, Carlos, Jackson Machado-Pinto, Unaí Tupinambás, Estevão Urbano Silva, and Bráulio R. G. M. Couto. "404. COVID-19 Normality Rate: Criteria for Optimal Time to Return to In-person Learning." Open Forum Infectious Diseases 8, Supplement_1 (November 1, 2021): S303–S304. http://dx.doi.org/10.1093/ofid/ofab466.605.

Full text
Abstract:
Background: The COVID-19 pandemic created the most severe global education disruption in history. According to UNESCO, at the peak of the crisis over 1.6 billion learners in more than 190 countries were out of school. After one year, half of the world’s student population was still affected by full or partial school closures. Here we investigated whether it is possible to build a multivariate score for dynamic school decision-making, especially in scenarios without population-scale RT-PCR tests. Methods: The normality rate is based on a COVID-19 risk matrix (Table 1). The total score (TS) is obtained by summing the risk scores for COVID-19, considering the six parameters of the pandemic in a city. The COVID-19 Normality Rate (CNR) is obtained by linear interpolation such that a total score of 30 points is equivalent to a 100% possibility of normality, while a city with only six total points would have a zero percent chance of returning to normality: CNR = (TS – 6)/24 (%). The criteria for opening and closing schools can be defined based on the percentages of return to normality (Table 2). Table 1. Limits for each parameter of the risk matrix and "normality" scores in relation to COVID-19: the lower the risk, the higher the “normality” score. Table 2. Criteria for opening and closing schools in a city according to the COVID-19 Normality Rate.
Results: On June 3, 2021, we evaluated all 5,570 Brazilian cities (Figure 1): 2,708 cities (49%) with a COVID-19 normality rate below 50% (full school closure), 2,223 cities (40%) with a normality rate between 50% and 70% (in-person learning only for children aged 5 years and 8 months), 583 cities with a normality rate between 71% and 80% (in-person learning extended to children aged 12 years and under), 583 cities (1%) with a normality rate between 81% and 90% (in-person learning extended to the student population up to age 18), and just one city with a 92% COVID-19 normality rate (in-person learning extended to the entire student population). We calculated the COVID-19 normality rate between January and May 2021 in four countries: Brazil, USA, UK, and Italy (Figure 2). On June 3, 2021, the percentage of people fully vaccinated in Brazil varied from 0% to 69%, with an average of 11%. Figure 1. COVID-19 Normality Rate in 5,570 cities in Brazil, Jun/03/2021. Figure 2. COVID-19 Normality Rate between January and May 2021: comparison among Brazil, USA, UK, and Italy. Conclusion: COVID-19 vaccination programs take several months to implement. Besides full vaccination of the population, it is important to check whether people are actually protected from the virus. The COVID-19 Normality Rate is a double-check multivariate score that can be used as a criterion for the optimal time to return to in-person learning safely. Disclosures All Authors: No reported disclosures
APA, Harvard, Vancouver, ISO, and other styles
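The linear interpolation defined in the abstract above, CNR = (TS – 6)/24, is simple enough to state as code. The function name is ours; the score range (6 to 30 points) comes from the abstract's risk matrix:

```python
def covid_normality_rate(total_score):
    """Map the risk-matrix total score (6..30) to a percentage:
    30 points -> 100% normality, 6 points -> 0%."""
    return (total_score - 6) / 24 * 100.0
```

The school opening/closing thresholds in Table 2 (e.g., below 50% means full closure) are then applied to this percentage.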
20

Thanh, Pham Duy, Tran Nhut Khai Hoan, Hoang Thi Huong Giang, and Insoo Koo. "Cache-Enabled Data Rate Maximization for Solar-Powered UAV Communication Systems." Electronics 9, no. 11 (November 20, 2020): 1961. http://dx.doi.org/10.3390/electronics9111961.

Full text
Abstract:
Currently, deploying fixed terrestrial infrastructures is not cost-effective in temporary circumstances, such as natural disasters, hotspots, and so on. Thus, we consider a system of caching-based UAV-assisted communications between multiple ground users (GUs) and a local station (LS). Specifically, a UAV is exploited to cache data from the LS and then serve GUs’ requests to handle the issue of unavailable or damaged links from the LS to the GUs. The UAV can harvest solar energy for its operation. We investigate joint cache scheduling and power allocation schemes by using the non-orthogonal multiple access (NOMA) technique to maximize the long-term downlink rate. Two scenarios for the network are taken into account. In the first, the harvested energy distribution of the GUs is assumed to be known, and we propose a partially observable Markov decision process framework such that the UAV can allocate optimal transmission power for each GU based on proper content caching over each flight period. In the second scenario where the UAV does not know the environment’s dynamics in advance, an actor-critic-based scheme is proposed to achieve a solution by learning with a dynamic environment. Afterwards, the simulation results verify the effectiveness of the proposed methods, compared to baseline approaches.
APA, Harvard, Vancouver, ISO, and other styles
21

Wei, Kefeng, Lincong Zhang, Xin Jiang, and Yi Guo. "Q-Learning-Based High Credibility and Stability Routing Algorithm for Internet of Medical Things." Wireless Communications and Mobile Computing 2020 (December 26, 2020): 1–10. http://dx.doi.org/10.1155/2020/8856271.

Full text
Abstract:
With the outbreak of COVID-19, people’s demand for using the Internet of Medical Things (IoMT) for physical health monitoring has increased dramatically. The considerable amount of data requires stable, reliable, real-time transmission, which has become an urgent problem to be solved. This paper constructs a health-monitoring-enabled IoMT network composed of several users carrying wearable devices and a coordinator. One of the important problems for the proposed network is the unstable and inefficient transmission of data packets caused by node congestion and link breakage during routing. To address this, we propose a Q-learning-based dynamic routing selection (QDRS) algorithm. First, a mathematical model of path optimization and a solution named Global Routing selection with high Credibility and Stability (GRCS) are proposed to select the optimal path globally. However, during data transmission along the optimal path, the node and link status may change, causing packet loss or retransmission, a problem not considered by standard routing algorithms. Therefore, this paper proposes a local link dynamic adjustment scheme based on GRCS, using the Q-learning algorithm to select the optimal next-hop node for each intermediate forwarding node. If the selected node differs from the one in the original path, the chosen node replaces the downstream node in the original path, correcting the optimal path in time. This paper considers the congestion state, remaining energy, and mobility of each node when selecting the path, and accounts for network state changes during packet transmission, which is the most significant innovation of this paper. The simulation results show that compared with other similar algorithms, the proposed algorithm can significantly improve the packet forwarding rate without seriously affecting network energy consumption and delay.
APA, Harvard, Vancouver, ISO, and other styles
22

Cao, Huazhen, Chong Gao, Xuan He, Yang Li, and Tao Yu. "Multi-Agent Cooperation Based Reduced-Dimension Q(λ) Learning for Optimal Carbon-Energy Combined-Flow." Energies 13, no. 18 (September 14, 2020): 4778. http://dx.doi.org/10.3390/en13184778.

Full text
Abstract:
This paper builds an optimal carbon-energy combined-flow (OCECF) model to optimize the carbon emissions and energy losses of power grids simultaneously. A novel multi-agent cooperative reduced-dimension Q(λ) (MCR-Q(λ)) algorithm is proposed for solving the model. Firstly, on the basis of the traditional single-objective Q(λ) algorithm, the solution space is effectively reduced to shrink the size of the Q-value matrices. Then, based on the concept of ant colony cooperation, multiple agents are used to update the Q-value matrices iteratively, which significantly improves the update rate. Simulation on the IEEE 118-bus system indicates that the proposed technique can cut the convergence time by hundreds of times compared with conventional Q(λ) while keeping high global stability, making it well suited for dynamic OCECF in large, complex power grids compared with other algorithms.
APA, Harvard, Vancouver, ISO, and other styles
23

Rodriguez, Renato, Yan Wang, Joseph Ozanne, Dogan Sumer, Dimitar Filev, and Damoon Soudbakhsh. "Adaptive Takeoff Maneuver Optimization of a Sailing Boat for America’s Cup." Journal of Sailing Technology 7, no. 01 (October 17, 2022): 88–103. http://dx.doi.org/10.5957/jst/2022.7.4.88.

Full text
Abstract:
This paper presents the development of optimal takeoff maneuvers for an AC75 foiling sailboat competing in the oldest and most prestigious sailboat competition in the world, America’s Cup. The AC75 sailboat presents many challenges to developing these optimal maneuvers due to its nonlinear, high-dimensional, and highly unstable dynamics. During the takeoff maneuver, the boat starts in the water at low speed (displacement mode) and increases its speed until it reaches steady-state foiling (the hull stays out of the water). We optimized the time for the boat’s transition from displacement mode to foiling mode while maximizing the projection of the velocity in the desired target direction (VMG). We used an adaptive control approach to obtain these optimal maneuvers, which involved using a high-fidelity sailboat simulator for data generation and Jacobian learning for optimization. The optimal solutions were subject to value and rate constraints based on the physical limitations of the actuators, as well as the constraints enforced by the sailors’ ability to perform such maneuvers. The optimal takeoff maneuver had an average VMG of 7.42 m/s; the boat reached the desired takeoff velocity in 14.8 s and completed the entire maneuver in 36.4 s. The optimal solutions provide insightful information about the dynamic behavior of this complex system and serve as benchmarks for the sailors.
APA, Harvard, Vancouver, ISO, and other styles
24

DE FRANCO, CARMINE, JOHANN NICOLLE, and HUYÊN PHAM. "BAYESIAN LEARNING FOR THE MARKOWITZ PORTFOLIO SELECTION PROBLEM." International Journal of Theoretical and Applied Finance 22, no. 07 (November 2019): 1950037. http://dx.doi.org/10.1142/s0219024919500377.

Full text
Abstract:
We study the Markowitz portfolio selection problem with unknown drift vector in the multi-dimensional framework. The prior belief on the uncertain expected rate of return is modeled by an arbitrary probability law, and a Bayesian approach from filtering theory is used to learn the posterior distribution about the drift given the observed market data of the assets. The Bayesian Markowitz problem is then embedded into an auxiliary standard control problem that we characterize by a dynamic programming method and prove the existence and uniqueness of a smooth solution to the related semi-linear partial differential equation (PDE). The optimal Markowitz portfolio strategy is explicitly computed in the case of a Gaussian prior distribution. Finally, we measure the quantitative impact of learning, updating the strategy from observed data, compared to nonlearning, using a constant drift in an uncertain context, and analyze the sensitivity of the value of information with respect to various relevant parameters of our model.
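For a Gaussian prior, the Bayesian learning step described in this entry reduces to a conjugate update of the drift's mean and covariance. A minimal discrete-sample sketch (illustrative only: the paper works in continuous time via filtering, and `posterior_drift` and its known-covariance assumption are ours):

```python
import numpy as np

def posterior_drift(returns, m0, S0, Sigma):
    """Conjugate Gaussian update for an unknown drift vector.

    returns : (n, d) observed asset returns, modeled as N(b, Sigma)
    m0, S0  : prior mean (d,) and covariance (d, d) on the drift b
    Sigma   : known covariance (d, d) of the returns
    """
    n = len(returns)
    rbar = returns.mean(axis=0)                    # sample mean return
    S0_inv = np.linalg.inv(S0)
    Sig_inv = np.linalg.inv(Sigma)
    Sn = np.linalg.inv(S0_inv + n * Sig_inv)       # posterior covariance
    mn = Sn @ (S0_inv @ m0 + n * Sig_inv @ rbar)   # posterior mean
    return mn, Sn

# Demo: with 100 identical observations, the posterior mean moves almost
# all the way from the prior (0) to the sample mean (1).
returns = np.ones((100, 1))
mn, Sn = posterior_drift(returns, m0=np.zeros(1), S0=np.eye(1), Sigma=np.eye(1))
```

As more data arrive, the posterior mean shifts from the prior toward the sample mean and the posterior covariance shrinks, which is the "value of information" effect the paper quantifies.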
APA, Harvard, Vancouver, ISO, and other styles
25

Wang, Yi, and Junhai Sun. "Design and Implementation of Virtual Reality Interactive Product Software Based on Artificial Intelligence Deep Learning Algorithm." Advances in Multimedia 2022 (April 26, 2022): 1–7. http://dx.doi.org/10.1155/2022/9104743.

Full text
Abstract:
The aim of this study is to meet the interactive needs of artificial intelligence in virtual reality environments. Based on an in-depth study of the interactive requirements of virtual reality, a virtual reality interactive glove based on a nine-axis inertial sensor and realized with artificial intelligence deep learning algorithms is designed. The AI deep learning algorithms employed include the KNN, SVM, Fuzzy, PNN, and DTW algorithms. Static gesture recognition is relatively simple; dynamic gesture recognition requires locating the start and end points of a real-time gesture sequence. By building a directed graph structure, the global optimal solution can be retrieved quickly and the gesture starting point determined, and dynamic programming is used to solve for the minimum distance between two points, avoiding an exhaustive graph search and improving efficiency. The results showed that for 50 gestures, including select object, attract object, zoom object, rotate object, shoot small box, exhale menu, and close menu, the recognition rates were 100%, 94%, 96%, 100%, 92%, 100%, and 100%, respectively. The motion data of the fingers and palm are captured by the nine-axis sensor, and gesture recognition is carried out using an artificial intelligence deep learning algorithm. It is proved that the artificial intelligence deep learning algorithm can effectively realize the design of virtual reality interactive product software.
APA, Harvard, Vancouver, ISO, and other styles
26

Shi, Junqing, Fengxiang Qiao, Qing Li, Lei Yu, and Yongju Hu. "Application and Evaluation of the Reinforcement Learning Approach to Eco-Driving at Intersections under Infrastructure-to-Vehicle Communications." Transportation Research Record: Journal of the Transportation Research Board 2672, no. 25 (October 1, 2018): 89–98. http://dx.doi.org/10.1177/0361198118796939.

Full text
Abstract:
Eco-driving behavior is able to improve vehicles’ fuel consumption efficiency and minimize exhaust emissions, especially with the presence of infrastructure-to-vehicle (I2V) communications for connected vehicles. Several techniques such as dynamic programming and neural networks have been proposed to study eco-driving behavior. However, most techniques need a complicated problem-solving process and cannot be applied to dynamic traffic conditions. Comparatively, reinforcement learning (RL) presents great potential for self-learning to take actions in a complicated environment to achieve the optimal mapping between traffic conditions and the corresponding optimal control action of a vehicle. In this paper, a vehicle was treated as an agent to select its maneuver, that is, acceleration, cruise speed, and deceleration, according to dynamic conditions while approaching a signalized intersection equipped with I2V communication. An improved cellular automation model was utilized as the simulation platform. Three parameters, including the distance between the vehicle and the intersection, signal status, and instant vehicle speeds, were selected to characterize real-time traffic state. The total CO2 emitted by the vehicle on the approach to the intersection serves as a measure of reward policy that informs the vehicle how good its operation was. The Q-learning algorithm was utilized to optimize vehicle driving behaviors for eco-driving. Vehicle exhaust emissions and traffic performance (travel time, stop duration, and stop rate) were evaluated in two cases: (1) an isolated intersection, and (2) a medium-scale realistic network. Simulation results showed that the eco-driving behavior obtained by RL can not only reduce emissions but also optimize traffic performance.
APA, Harvard, Vancouver, ISO, and other styles
27

Zhang, Xiyue, and Guiping Chen. "Machine Learning Model-Based English Project Learning and Functional Research." Wireless Communications and Mobile Computing 2022 (April 4, 2022): 1–11. http://dx.doi.org/10.1155/2022/1940375.

Full text
Abstract:
Against the background of the rapid development of machine learning and information technology, the traditional classroom is gradually being replaced by the media classroom. To tackle abstract, hard-to-grasp content and the limitations of practical teaching, online classroom design principles are used to embody the optimization and design of English language teaching from a machine learning perspective, based on English project learning. Building on the advantages of digitization, a deep learning algorithm is used to establish a classroom application model from the input information, and the English language and relevant language application scenarios are presented from the machine's perspective. Simulated textbook contents and related extended knowledge points are displayed in classrooms through online teaching. The current advantages of Internet communication are combined with machine learning algorithms for field simulations and calculations on English course content, and the actual operation process of English learning is realized online. This ensures the stability and transmission accuracy of online classrooms, reduces information omission and loss during data transmission, and obtains the optimal solution for data simulation in real-time scenarios. The research demonstrates that machine learning combined with online classroom design breaks through the face-to-face book teaching of traditional classrooms via dynamic demonstration of real-life and work scenes and innovative design. It also stimulates students’ interest in English courses, enhances the overall learning rate, makes English project learning more effective, and is conducive to cultivating comprehensive language talents in the new age.
APA, Harvard, Vancouver, ISO, and other styles
28

Chen, Jinyu, Ziqi Zhong, Qindi Feng, and Lei Liu. "The Multimodal Emotion Information Analysis of E-Commerce Online Pricing in Electronic Word of Mouth." Journal of Global Information Management 30, no. 11 (April 7, 2022): 1–17. http://dx.doi.org/10.4018/jgim.315322.

Full text
Abstract:
E-commerce has developed rapidly, and product promotion is how e-commerce stimulates consumers' purchasing activity. The demands on, and computational complexity of, the decision-making process are urgent problems to be solved in optimizing dynamic pricing for e-commerce product lines. Therefore, a Q-learning model based on a neural network is proposed, building on multimodal emotion information recognition and analysis, and the dynamic pricing problem of the product line is studied. A multimodal fusion model is established by fusing speech emotion recognition and image emotion recognition to classify consumers' emotions, which are then used as auxiliary material for understanding and analyzing market demand. The long short-term memory (LSTM) classifier performs excellent image feature extraction: its accuracy rate is 3.92%-6.74% higher than that of other similar classifiers, and the accuracy rate of the image single-feature optimal model is 9.32% higher than that of the speech single-feature model.
APA, Harvard, Vancouver, ISO, and other styles
29

Zhou, Tao, Zengchuan Dong, Xiuxiu Chen, and Qihua Ran. "Decision Support Model for Ecological Operation of Reservoirs Based on Dynamic Bayesian Network." Water 13, no. 12 (June 14, 2021): 1658. http://dx.doi.org/10.3390/w13121658.

Full text
Abstract:
In this study, a model based on the sustainable boundary approach was proposed to provide decision support for reservoir ecological operation with a dynamic Bayesian network. The proposed model was developed in four steps: (1) calculating and verifying the sustainable boundaries in combination with the ecological objectives of the study area, (2) generating learning samples by establishing an optimal operation model and a Monte Carlo simulation model, (3) establishing and training a dynamic Bayesian network from the learning samples, and (4) calculating, period by period, the probability of the economic and ecological targets exceeding the set threshold with the trained dynamic Bayesian network model. Using the proposed model, the water drawing of the reservoir can be adjusted dynamically according to the probability of the economic and ecological targets exceeding the set threshold during reservoir operation. In this study, the proposed model was applied to the middle reaches of the Heihe River, the effect of the water supply proportion on the probability of the economic target exceeding the set threshold was analyzed, and the response of the reservoir water storage in each period to the probability of the target exceeding the set threshold was calculated. The results show that the risks can be analyzed with the proposed model. Compared with existing studies, the proposed model provides period-by-period guidance for the ecological operation of the reservoir and technical support for the formulation of the reservoir operation chart. Compared with the operation model based on the designed guaranteed rate, the uncertainty-based reservoir operation model reduces the variation range of the ecological flow shortage or overflow rate and the economic loss rate by 5% and 6%, respectively. Thus, the decision support model based on the dynamic Bayesian network can effectively reduce the influence of water inflow and rainfall uncertainties on reservoir operation.
APA, Harvard, Vancouver, ISO, and other styles
30

Louta, M., P. Sarigiannidis, S. Misra, P. Nicopolitidis, and G. Papadimitriou. "RLAM: A Dynamic and Efficient Reinforcement Learning-Based Adaptive Mapping Scheme in Mobile WiMAX Networks." Mobile Information Systems 10, no. 2 (2014): 173–96. http://dx.doi.org/10.1155/2014/213056.

Full text
Abstract:
WiMAX (Worldwide Interoperability for Microwave Access) constitutes a candidate networking technology towards the 4G vision realization. By adopting the Orthogonal Frequency Division Multiple Access (OFDMA) technique, the latest IEEE 802.16x amendments manage to provide QoS-aware access services with full mobility support. A number of interesting scheduling and mapping schemes have been proposed in the research literature. However, they neglect a considerable asset of OFDMA-based wireless systems: the dynamic adjustment of the downlink-to-uplink width ratio. In order to fully exploit the supported mobile WiMAX features, we design, develop, and evaluate a rigorous adaptive model, which inherits its main aspects from the reinforcement learning field. The proposed model endeavours to efficiently determine the downlink-to-uplink width ratio, on a frame-by-frame basis, taking into account both the downlink and uplink traffic in the Base Station (BS). Extensive evaluation results indicate that the proposed model succeeds in providing quite accurate estimations, keeping the average error rate below 15% with respect to the optimal sub-frame configurations. Additionally, it presents improved performance compared to other learning methods (e.g., learning automata) and notable improvements compared to static schemes that maintain a fixed predefined ratio, in terms of service ratio and resource utilization.
APA, Harvard, Vancouver, ISO, and other styles
31

Ou, Minghui, Hua Wei, Yiyi Zhang, and Jiancheng Tan. "A Dynamic Adam Based Deep Neural Network for Fault Diagnosis of Oil-Immersed Power Transformers." Energies 12, no. 6 (March 14, 2019): 995. http://dx.doi.org/10.3390/en12060995.

Full text
Abstract:
This paper presents a Dynamic Adam and dropout based deep neural network (DADDNN) for fault diagnosis of oil-immersed power transformers. To solve the problem of incomplete extraction of hidden information in a data-driven setting, the gradient first-order moment estimate and second-order moment estimate are used to calculate different learning rates for all parameters with stable gradient scaling. Meanwhile, the learning rate is dynamically attenuated according to the optimal interval. To prevent over-fitting, we exploit the dropout technique to randomly reset some neurons and strengthen the information exchange between indirectly linked neurons. Our proposed approach was utilized on four datasets to learn fault diagnosis of oil-immersed power transformers. Besides, four benchmark cases in other fields were also utilized to illustrate its scalability. The simulation results show that the average diagnosis accuracies of our proposed method on the four datasets were 37.9%, 25.5%, 14.6%, 18.9%, and 11.2% higher than the International Electrotechnical Commission (IEC) method, Duval Triangle, stacked autoencoders (SAE), deep belief networks (DBN), and grid search support vector machines (GSSVM), respectively.
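The moment estimates and attenuated learning rate described above can be sketched in a few lines. This is a generic Adam step with a simple time-based decay, assuming `lr_t = lr0 / (1 + decay * t)`; the paper's interval-based attenuation rule is not reproduced:

```python
import numpy as np

def dynamic_adam_step(param, grad, m, v, t, lr0=1e-3, beta1=0.9,
                      beta2=0.999, eps=1e-8, decay=1e-4):
    """One Adam step with a time-based learning-rate attenuation.

    m, v are the first- and second-moment estimates; the simple
    lr0 / (1 + decay * t) schedule stands in for the paper's
    interval-based attenuation.
    """
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias corrections
    v_hat = v / (1 - beta2 ** t)
    lr_t = lr0 / (1 + decay * t)                 # attenuated learning rate
    param -= lr_t * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Demo: minimize f(x) = x^2 from x = 1 (gradient is 2x).
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = dynamic_adam_step(x, 2 * x, m, v, t)
```

On this toy quadratic the moment estimates keep the effective step size stable while the schedule shrinks it over time, which is the general behavior the DADDNN paper exploits.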
APA, Harvard, Vancouver, ISO, and other styles
32

Wang, Ziwei, Xin Wang, Yijie Tang, Ying Liu, and Jun Hu. "Optimal Tracking Control of a Nonlinear Multiagent System Using Q-Learning via Event-Triggered Reinforcement Learning." Entropy 25, no. 2 (February 5, 2023): 299. http://dx.doi.org/10.3390/e25020299.

Full text
Abstract:
This article offers an optimal tracking control method using an event-triggered technique and the internal reinforcement Q-learning (IrQL) algorithm to address the tracking control issue of unknown nonlinear multiagent systems (MASs). Relying on the internal reinforcement reward (IRR) formula, a Q-learning function is calculated, and the iterative IrQL method is then developed. In contrast to time-triggered mechanisms, an event-triggered algorithm reduces the transmission rate and computational load, since the controller is upgraded only when the predetermined triggering conditions are met. In addition, to implement the suggested system, a neural reinforce-critic-actor (RCA) network structure is created that can assess the performance indices and support online learning of the event-triggering mechanism. This strategy is intended to be data-driven, without in-depth knowledge of the system dynamics. We develop an event-triggered weight tuning rule that modifies the parameters of the actor neural network (ANN) only in response to triggering events. In addition, a Lyapunov-based convergence study of the reinforce-critic-actor neural network (NN) is presented. Lastly, an example demonstrates the accessibility and efficiency of the suggested approach.
APA, Harvard, Vancouver, ISO, and other styles
33

Wang, Huitao, Ruopeng Yang, Changsheng Yin, Xiaofei Zou, and Xuefeng Wang. "Research on the Difficulty of Mobile Node Deployment’s Self-Play in Wireless Ad Hoc Networks Based on Deep Reinforcement Learning." Wireless Communications and Mobile Computing 2021 (March 9, 2021): 1–13. http://dx.doi.org/10.1155/2021/4361650.

Full text
Abstract:
Deep reinforcement learning is a kind of machine learning algorithm that uses the maximum cumulative reward to learn the optimal strategy. The difficulty is how to ensure fast convergence of the model and generate a large amount of sample data to promote model optimization. Using the deep reinforcement learning framework of the AlphaZero algorithm, the deployment problem of wireless nodes in wireless ad hoc networks is treated as equivalent to the game of Go, and a deployment model of mobile nodes in wireless ad hoc networks based on the AlphaZero algorithm is designed. Because the wireless ad hoc network scenario lacks the chessboard's symmetry and invariance, the data sample set cannot be expanded by rotating or reorienting the board. A dynamically updated learning rate and the method of selecting the latest model to generate sample data are used to solve the problem of fast model convergence.
APA, Harvard, Vancouver, ISO, and other styles
34

Saleem, Muhammad, Yasir Saleem, H. M. Shahzad Asif, and M. Saleem Mian. "Quality Enhanced Multimedia Content Delivery for Mobile Cloud with Deep Reinforcement Learning." Wireless Communications and Mobile Computing 2019 (July 18, 2019): 1–15. http://dx.doi.org/10.1155/2019/5038758.

Full text
Abstract:
The importance of multimedia streaming using mobile devices has increased considerably. The dynamic adaptive streaming over HTTP is an efficient scheme for bitrate adaptation in which video is segmented and stored in different quality levels. The multimedia streaming with limited bandwidth and varying network environment for mobile users affects the user quality of experience. We have proposed an adaptive rate control using enhanced Double Deep Q-Learning approach to improve multimedia content delivery by switching quality level according to the network, device, and environment conditions. The proposed algorithm is thoroughly evaluated against state-of-the-art heuristic and learning-based algorithms. The performance metrics such as PSNR, SSIM, quality of experience, rebuffering frequency, and quality variations are evaluated. The results are obtained using real network traces which shows that the proposed algorithm outperforms the other schemes in all considered quality metrics. The proposed algorithm provides faster convergence to the optimal solution as compared to other algorithms considered in our work.
APA, Harvard, Vancouver, ISO, and other styles
35

Wang, Qiulin, Baole Tao, Fulei Han, and Wenting Wei. "Extraction and Recognition Method of Basketball Players’ Dynamic Human Actions Based on Deep Learning." Mobile Information Systems 2021 (June 26, 2021): 1–6. http://dx.doi.org/10.1155/2021/4437146.

Full text
Abstract:
The extraction and recognition of human actions have always been a research hotspot in the field of state recognition, with a wide range of application prospects in many fields. In sports, they can reduce the occurrence of accidental injuries and improve the training level of basketball players, so extracting effective features from the dynamic body movements of basketball players is of great significance. In order to improve the fairness of basketball games, realize accurate recognition of athletes' movements, and simultaneously raise athletes' level and regulate their movements during training, this article uses deep learning to extract and recognize the movements of basketball players. This paper implements a human action recognition algorithm based on deep learning. The method automatically extracts image features through convolution kernels, which greatly improves efficiency compared with traditional manual feature extraction. It uses the deep convolutional neural network VGG model on the TensorFlow platform to extract and recognize human actions. On the Matlab platform, the KTH and Weizmann datasets are preprocessed to obtain the input image set. Then, the preprocessed dataset is used to train the model, and the optimal network model and corresponding data are obtained by testing on the two datasets. Finally, the two datasets are analyzed in detail, the specific cause of each action confusion is given, and the recognition accuracy of each action category and the average recognition accuracy are calculated. The experimental results show that the human action recognition algorithm based on deep learning obtains a higher recognition accuracy rate.
APA, Harvard, Vancouver, ISO, and other styles
36

Maldonado, Bryan P., Nan Li, Ilya Kolmanovsky, and Anna G. Stefanopoulou. "Learning reference governor for cycle-to-cycle combustion control with misfire avoidance in spark-ignition engines at high exhaust gas recirculation–diluted conditions." International Journal of Engine Research 21, no. 10 (June 26, 2020): 1819–34. http://dx.doi.org/10.1177/1468087420929109.

Full text
Abstract:
Cycle-to-cycle feedback control is employed to achieve optimal combustion phasing while maintaining high levels of exhaust gas recirculation by adjusting the spark advance and the exhaust gas recirculation valve position. The control development is based on a control-oriented model that captures the effects of throttle position, exhaust gas recirculation valve position, and spark timing on the combustion phasing. Under the assumption that in-cylinder pressure information is available, an adaptive extended Kalman filter approach is used to estimate the exhaust gas recirculation rate into the intake manifold based on combustion phasing measurements. The estimation algorithm is adaptive since the cycle-to-cycle combustion variability (output covariance) is not known a priori and changes with operating conditions. A linear quadratic regulator controller is designed to maintain optimal combustion phasing while maximizing exhaust gas recirculation levels during load transients coming from throttle tip-in and tip-out commands from the driver. During throttle tip-outs, however, a combination of a high exhaust gas recirculation rate and an overly advanced spark, product of the dynamic response of the system, generates a sequence of misfire events. In this work, an explicit reference governor is used as an add-on scheme to the closed-loop system in order to avoid the violation of the misfire limit. The reference governor is enhanced with model-free learning which enables it to avoid misfires after a learning phase. Experimental results are reported which illustrate the potential of the proposed control strategy for achieving an optimal combustion process during highly diluted conditions for improving fuel efficiency.
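The estimator in this entry is an adaptive extended Kalman filter; its core predict/update cycle can be sketched in the simplest scalar, linear form (the online adaptation of the output covariance is omitted, and all constants are illustrative):

```python
def kalman_step(x, P, z, A=1.0, H=1.0, Q=1e-4, R=1e-2):
    """One scalar Kalman predict/update step (a linear stand-in for the
    paper's adaptive EKF; the online adaptation of R is omitted)."""
    x_pred = A * x                         # predict state
    P_pred = A * P * A + Q                 # predict error covariance
    K = P_pred * H / (H * P_pred * H + R)  # Kalman gain
    x_new = x_pred + K * (z - H * x_pred)  # correct with measurement z
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Demo: a constant measurement pulls the estimate to that value.
x, P = 0.0, 1.0
for _ in range(200):
    x, P = kalman_step(x, P, z=1.0)
```

The estimate converges to the measured value while the error covariance settles to a small steady-state value set by the ratio of process to measurement noise.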
APA, Harvard, Vancouver, ISO, and other styles
37

Marsetič, Rok, Darja Šemrov, and Marijan Žura. "Road Artery Traffic Light Optimization with Use of the Reinforcement Learning." PROMET - Traffic&Transportation 26, no. 2 (April 26, 2014): 101–8. http://dx.doi.org/10.7307/ptt.v26i2.1318.

Full text
Abstract:
The basic principle of optimal traffic control is an appropriate real-time response to dynamic traffic flow changes. Signal plan efficiency depends on a large number of input parameters. An actuated signal system can adjust very well to traffic conditions but cannot fully adjust to stochastic traffic volume oscillation. Due to the complexity of the problem, analytical methods are not applicable in real time; the purpose of this paper is therefore to introduce a heuristic method suitable for traffic light optimization in real time. With the evolution of artificial intelligence, new possibilities for solving complex problems have been introduced. The goal of this paper is to demonstrate that the Q learning algorithm is suitable for traffic light optimization. The Q learning algorithm was verified on a road artery with three intersections. To estimate the effectiveness and efficiency of the proposed algorithm, a comparison with an actuated signal plan was carried out. The results (average delay per vehicle and the number of vehicles that left the road network) show that the Q learning algorithm outperforms the actuated signal controllers. The proposed algorithm converges to the minimal delay per vehicle regardless of the stochastic nature of traffic. The impact of the model parameters (learning rate, exploration rate, influence of communication between agents, and reward type) on algorithm effectiveness was analysed as well.
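The Q-learning loop with its learning rate and exploration rate, the two parameters whose impact the paper analyses, can be sketched on a toy episodic task (the chain environment below is a stand-in, not the signalized artery):

```python
import random

def q_learning(env_step, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, epsilon=0.1, max_steps=100):
    """Tabular Q-learning with an epsilon-greedy policy.

    env_step(s, a) -> (next_state, reward, done); alpha is the learning
    rate and epsilon the exploration rate.
    """
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if random.random() < epsilon:              # explore
                a = random.randrange(n_actions)
            else:                                      # exploit
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = env_step(s, a)
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])      # learning-rate update
            if done:
                break
            s = s2
    return Q

# Stand-in episodic task: a five-state chain with a -1 step cost and a
# terminal state on the right (not the paper's traffic simulator).
def chain_step(s, a):
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    return s2, -1.0, s2 == 4

random.seed(0)
Q = q_learning(chain_step, n_states=5, n_actions=2)
```

With the step cost, untried actions look optimistic relative to repeatedly penalized ones, so the agent explores systematically even with a small epsilon and learns to move right toward the terminal state.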
APA, Harvard, Vancouver, ISO, and other styles
38

Mayer, Polina N., Victor V. Pogorelko, Dmitry S. Voronin, and Alexander E. Mayer. "Spall Fracture of Solid and Molten Copper: Molecular Dynamics, Mechanical Model and Strain Rate Dependence." Metals 12, no. 11 (November 3, 2022): 1878. http://dx.doi.org/10.3390/met12111878.

Full text
Abstract:
In this study, we formulate a mechanical model of spall fracture of copper, which describes both solid and molten states. The model is verified, and its parameters are found based on the data of molecular dynamics simulations of this process under ultrahigh strain rate of tension, leading to the formation of multiple pores within the considered volume element. A machine-learning-type Bayesian algorithm is used to identify the optimal parameters of the model. We also analyze the influence of the initial size distribution of pores or non-wettable inclusions in copper on the strain rate dependence of its spall strength and show that these initial heterogeneities explain the existing experimental data for moderate strain rates. This investigation promotes the development of atomistically-based machine learning approaches to description of the strength properties of metals and deepens the understanding of the spall fracture process.
APA, Harvard, Vancouver, ISO, and other styles
39

Yazid, Yassine, Antonio Guerrero-González, Imad Ez-Zazi, Ahmed El Oualkadi, and Mounir Arioua. "A Reinforcement Learning Based Transmission Parameter Selection and Energy Management for Long Range Internet of Things." Sensors 22, no. 15 (July 28, 2022): 5662. http://dx.doi.org/10.3390/s22155662.

Full text
Abstract:
Long Range (LoRa) technology has expanded the Internet of Things (IoT) landscape to cover long-range applications. LoRa-enabled IoT devices adopt an Adaptive Data Rate (ADR) mechanism to assign transmission parameters such as spreading factors, transmission energy, and coding rates. Nevertheless, the energy cost of these combinations should be assessed carefully to select an accurate combination; in particular, the trade-off between computational and transmission energy consumption should be considered to guarantee the effectiveness of the physical-parameter tuning. This paper provides comprehensive details of LoRa transceiver functioning mechanisms and a mathematical model for estimating the energy consumption of the end devices (EDs). To select the optimal transmission parameters, we model the LoRa energy optimization and transmission parameter selection problem as a Markov decision process (MDP). The dynamic system surveys the environment state (the residual energy and channel state) and searches for the optimal actions to minimize the long-term average cost at each time slot. The proposed method was evaluated under different scenarios and then compared to the default LoRaWAN ADR in terms of energy efficiency and reliability. The numerical results show that our method outperforms the standard LoRa ADR mechanism, since it permits the EDs to save more energy and enables them to operate longer and consequently perform more transmissions.
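For a small discretized state space such as battery levels, the kind of MDP described above can be solved by standard value iteration. A toy sketch with made-up numbers, not the paper's model:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Solve a finite MDP by value iteration.

    P : (A, S, S) transition probabilities, R : (A, S) expected rewards.
    Returns the optimal state values and a greedy policy.
    """
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)            # action values, shape (A, S)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Toy instance: states 0 = low battery, 1 = high battery;
# action 0 = low-power transmit, action 1 = high-power transmit.
P = np.array([[[0.2, 0.8], [0.0, 1.0]],    # low power: battery recovers
              [[1.0, 0.0], [0.9, 0.1]]])   # high power: battery drains
R = np.array([[0.3, 0.3],                  # modest rate at low power
              [0.0, 1.0]])                 # high rate only when charged
V, policy = value_iteration(P, R)
```

In this toy instance the optimal policy transmits at low power when the battery is low, the same save-energy-now-to-transmit-more-later trade-off the paper's MDP captures.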
APA, Harvard, Vancouver, ISO, and other styles
40

Chang, Chung-Ho, and Jen-Ming Chen. "Capacity Policy for an OEM under Production Ramp-Up and Demand Diffusion." Mathematical Problems in Engineering 2022 (May 26, 2022): 1–22. http://dx.doi.org/10.1155/2022/9510184.

Full text
Abstract:
This paper presents a study on capacity policy for an OEM that launches a new short-life-cycle consumer durable product and undergoes a demand diffusion process and a production ramp-up process with learning. The demand/supply dynamic system is described by the Bass diffusion demand rate and a time-constant production rate. The mismatch between the demand peak and the production plateau creates challenges for balancing the interactive trajectories. Indivisible lumpy capacity increments in the production network exacerbate the difficulty. In conjunction with mathematical and graphical analysis and computational power, we develop a discrete optimization model (running on CPLEX) to investigate “how the OEM should determine an optimal capacity size” and “when to market the new product and expand capacity” under various Bass parameters, which describe the behavior intensity of the innovation and imitation of new product consumers.
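The Bass diffusion demand rate driving the model above has a standard closed form; the sketch below evaluates it with illustrative parameter values (the coefficients and market size are hypothetical, not the paper's).

```python
import math

def bass_adoption_rate(t, p, q, m):
    """Bass diffusion demand rate m*f(t): p is the coefficient of
    innovation, q the coefficient of imitation, m the market potential."""
    e = math.exp(-(p + q) * t)
    return m * ((p + q) ** 2 / p) * e / (1 + (q / p) * e) ** 2

# Hypothetical parameters: p = 0.03, q = 0.38, m = 100,000 units.
rates = [bass_adoption_rate(t, 0.03, 0.38, 100_000) for t in range(20)]
peak_period = rates.index(max(rates))  # demand peaks near t* = ln(q/p)/(p+q)
```

Locating `peak_period` against the production plateau is exactly the demand/supply mismatch the paper's capacity model must balance.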
APA, Harvard, Vancouver, ISO, and other styles
41

Li, Shu, Jiong Yu, Xusheng Du, Yi Lu, and Rui Qiu. "Fair Outlier Detection Based on Adversarial Representation Learning." Symmetry 14, no. 2 (February 9, 2022): 347. http://dx.doi.org/10.3390/sym14020347.

Full text
Abstract:
Outlier detection aims to identify rare, minority objects in a dataset that are significantly different from the majority. When a minority group (defined by sensitive attributes, such as gender, race, age, etc.) does not represent the target group for outlier detection, outlier detection methods are likely to propagate statistical biases in the data and generate unfair results. Our work focuses on studying the fairness of outlier detection. We characterize the properties of fair outlier detection and propose an appropriate outlier detection method that combines adversarial representation learning and the LOF algorithm (AFLOF). Unlike the FairLOF method that adds fairness constraints to the LOF algorithm, AFLOF uses adversarial networks to learn the optimal representation of the original data while hiding the sensitive attribute in the data. We introduce a dynamic weighting module that assigns lower weight values to data objects with higher local outlier factors to eliminate the influence of outliers on representation learning. Lastly, we conduct comparative experiments on six publicly available datasets. The results demonstrate that compared to the density-based LOF method and the recently proposed FairLOF method, our proposed AFLOF method has a significant advantage in both the outlier detection performance and fairness.
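A minimal, from-scratch sketch of the local outlier factor scores that AFLOF's dynamic weighting module builds on follows; the weighting rule shown (down-weighting points with high LOF) is a simplified stand-in for the paper's module, and the data are made up.

```python
import math

def _dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def lof_scores(points, k=2):
    """Plain LOF: k-distance, reachability distance, local reachability
    density (lrd), then LOF as the ratio of neighbour lrds to own lrd."""
    n = len(points)
    neigh, kdist = [], []
    for i in range(n):
        ds = sorted((_dist(points[i], points[j]), j)
                    for j in range(n) if j != i)
        neigh.append([j for _, j in ds[:k]])
        kdist.append(ds[k - 1][0])

    def lrd(i):
        reach = [max(kdist[j], _dist(points[i], points[j])) for j in neigh[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in neigh[i]) / (k * lrds[i]) for i in range(n)]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]  # last point is an outlier
scores = lof_scores(points, k=2)
# Simplified dynamic weighting: higher LOF -> lower weight in training.
weights = [1.0 / (1.0 + s) for s in scores]
```

Points inside the cluster get LOF near 1 (and weight near 0.5), while the isolated point gets a much larger LOF and a correspondingly small weight, which is the effect the dynamic weighting module exploits.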
APA, Harvard, Vancouver, ISO, and other styles
42

Zhang, Zhen, and Dongqing Wang. "EAQR: A Multiagent Q-Learning Algorithm for Coordination of Multiple Agents." Complexity 2018 (August 28, 2018): 1–14. http://dx.doi.org/10.1155/2018/7172614.

Full text
Abstract:
We propose a cooperative multiagent Q-learning algorithm called exploring actions according to Q-value ratios (EAQR). Our aim is to design a multiagent reinforcement learning algorithm for cooperative tasks where multiple agents need to coordinate their behavior to achieve the best system performance. In EAQR, the Q-value represents the probability of getting the maximal reward, while each action is selected according to the ratio of its Q-value to the sum of all actions’ Q-values, together with the exploration rate ε. Seven cooperative repeated games are used as cases to study the dynamics of EAQR. Theoretical analyses show that in some cases the optimal joint strategies correspond to the stable critical points of EAQR. Moreover, comparison experiments on stochastic games with finite steps are conducted. One is the box-pushing problem, and the other is the distributed sensor network problem. Experimental results show that EAQR outperforms the other algorithms in the box-pushing problem and achieves the theoretical optimal performance in the distributed sensor network problem.
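The EAQR action-selection rule described above can be sketched as follows; how ε enters the probabilities here is my reading of the abstract (a uniform-exploration blend), not necessarily the paper's exact equation.

```python
def eaqr_probs(q_values, epsilon=0.1):
    """Action probabilities proportional to Q-value ratios, blended with
    a uniform exploration term epsilon (interpretation of the abstract)."""
    n = len(q_values)
    total = sum(q_values)
    if total == 0:                 # no information yet: explore uniformly
        return [1.0 / n] * n
    return [(1 - epsilon) * q / total + epsilon / n for q in q_values]

probs = eaqr_probs([0.6, 0.3, 0.1], epsilon=0.1)
greedy = probs.index(max(probs))   # most likely action
```

Unlike ε-greedy, every action keeps a selection probability proportional to its Q-value ratio, so exploration is graded rather than all-or-nothing.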
APA, Harvard, Vancouver, ISO, and other styles
43

Kim, Sang-Ho, Deog-Yeong Park, and Ki-Hoon Lee. "Hybrid Deep Reinforcement Learning for Pairs Trading." Applied Sciences 12, no. 3 (January 18, 2022): 944. http://dx.doi.org/10.3390/app12030944.

Full text
Abstract:
Pairs trading is an investment strategy that exploits the short-term price difference (spread) between two co-moving stocks. Recently, pairs trading methods based on deep reinforcement learning have yielded promising results. These methods can be classified into two approaches: (1) indirectly determining trading actions based on trading and stop-loss boundaries and (2) directly determining trading actions based on the spread. In the former approach, the trading boundary is completely dependent on the stop-loss boundary, which is certainly not optimal. In the latter approach, there is a risk of significant loss because of the absence of a stop-loss boundary. To overcome the disadvantages of the two approaches, we propose a hybrid deep reinforcement learning method for pairs trading called HDRL-Trader, which employs two independent reinforcement learning networks; one for determining trading actions and the other for determining stop-loss boundaries. Furthermore, HDRL-Trader incorporates novel techniques, such as dimensionality reduction, clustering, regression, behavior cloning, prioritized experience replay, and dynamic delay, into its architecture. The performance of HDRL-Trader is compared with the state-of-the-art reinforcement learning methods for pairs trading (P-DDQN, PTDQN, and P-Trader). The experimental results for twenty stock pairs in the Standard & Poor’s 500 index show that HDRL-Trader achieves an average return rate of 82.4%, which is 25.7 percentage points higher than that of the second-best method, and yields significantly positive return rates for all stock pairs.
APA, Harvard, Vancouver, ISO, and other styles
44

Hoppe, David, and Constantin A. Rothkopf. "Learning rational temporal eye movement strategies." Proceedings of the National Academy of Sciences 113, no. 29 (July 5, 2016): 8332–37. http://dx.doi.org/10.1073/pnas.1601305113.

Full text
Abstract:
During active behavior humans redirect their gaze several times every second within the visual environment. Where we look within static images is highly efficient, as quantified by computational models of human gaze shifts in visual search and face recognition tasks. However, when we shift gaze is mostly unknown despite its fundamental importance for survival in a dynamic world. It has been suggested that during naturalistic visuomotor behavior gaze deployment is coordinated with task-relevant events, often predictive of future events, and studies in sportsmen suggest that timing of eye movements is learned. Here we establish that humans efficiently learn to adjust the timing of eye movements in response to environmental regularities when monitoring locations in the visual scene to detect probabilistically occurring events. To detect the events humans adopt strategies that can be understood through a computational model that includes perceptual and acting uncertainties, a minimal processing time, and, crucially, the intrinsic costs of gaze behavior. Thus, subjects traded off event detection rate with behavioral costs of carrying out eye movements. Remarkably, based on this rational bounded actor model the time course of learning the gaze strategies is fully explained by an optimal Bayesian learner with humans’ characteristic uncertainty in time estimation, the well-known scalar law of biological timing. Taken together, these findings establish that the human visual system is highly efficient in learning temporal regularities in the environment and that it can use these regularities to control the timing of eye movements to detect behaviorally relevant events.
APA, Harvard, Vancouver, ISO, and other styles
45

Abdalla, Hemn Barzan, Awder M. Ahmed, Subhi R. M. Zeebaree, Ahmed Alkhayyat, and Baha Ihnaini. "Rider weed deep residual network-based incremental model for text classification using multidimensional features and MapReduce." PeerJ Computer Science 8 (March 31, 2022): e937. http://dx.doi.org/10.7717/peerj-cs.937.

Full text
Abstract:
Increasing demands for information and the rapid growth of big data have dramatically increased the amount of textual data. In order to obtain useful text information, the classification of texts is considered an imperative task. Accordingly, this article will describe the development of a hybrid optimization algorithm for classifying text. Here, pre-processing was done using the stemming process and stop word removal. Additionally, we performed the extraction of imperative features and the selection of optimal features using the Tanimoto similarity, which estimates the similarity between features and selects the relevant features with higher feature selection accuracy. Following that, a deep residual network trained by the Adam algorithm was utilized for dynamic text classification. Dynamic learning was performed using the proposed Rider invasive weed optimization (RIWO)-based deep residual network along with fuzzy theory. The proposed RIWO algorithm combines invasive weed optimization (IWO) and the Rider optimization algorithm (ROA). These processes are carried out under the MapReduce framework. Our analysis revealed that the proposed RIWO-based deep residual network outperformed other techniques with the highest true positive rate (TPR) of 85%, true negative rate (TNR) of 94%, and accuracy of 88.7%.
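The Tanimoto similarity used above for feature selection has a simple form for binary feature vectors; the thresholding step in the sketch below is a hypothetical illustration of "select the relevant features", not the paper's actual selection rule.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two binary vectors:
    |intersection| / |union| of the set bits."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union if union else 1.0

# Hypothetical selection: keep candidate features whose bit profile is
# similar enough to a reference "relevant" profile.
reference = [1, 1, 0, 1, 0]
candidates = {"f1": [1, 1, 0, 1, 0],
              "f2": [0, 0, 1, 0, 1],
              "f3": [1, 1, 0, 0, 0]}
selected = [name for name, vec in candidates.items()
            if tanimoto(vec, reference) >= 0.5]
```

Here `f1` matches the reference exactly (similarity 1.0), `f3` partially (2/3), and `f2` not at all (0), so the 0.5 threshold keeps `f1` and `f3`.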
APA, Harvard, Vancouver, ISO, and other styles
46

Khanh, Tran Trong, Tran Hoang Hai, Md Delowar Hossain, and Eui-Nam Huh. "Fuzzy-Assisted Mobile Edge Orchestrator and SARSA Learning for Flexible Offloading in Heterogeneous IoT Environment." Sensors 22, no. 13 (June 23, 2022): 4727. http://dx.doi.org/10.3390/s22134727.

Full text
Abstract:
In the era of heterogeneous 5G networks, Internet of Things (IoT) devices have significantly altered our daily life by providing innovative applications and services. However, these devices process large amounts of data traffic and their application requires an extremely fast response time and a massive amount of computational resources, leading to a high failure rate for task offloading and considerable latency due to congestion. To improve the quality of services (QoS) and performance due to the dynamic flow of requests from devices, numerous task offloading strategies in the area of multi-access edge computing (MEC) have been proposed in previous studies. Nevertheless, the neighboring edge servers, where computational resources are in excess, have not been considered, leading to unbalanced loads among edge servers in the same network tier. Therefore, in this paper, we propose a collaboration algorithm between a fuzzy-logic-based mobile edge orchestrator (MEO) and state-action-reward-state-action (SARSA) reinforcement learning, which we call the Fu-SARSA algorithm. We aim to minimize the failure rate and service time of tasks and decide on the optimal resource allocation for offloading, such as a local edge server, cloud server, or the best neighboring edge server in the MEC network. Four typical application types, healthcare, AR, infotainment, and compute-intensive applications, were used for the simulation. The performance results demonstrate that our proposed Fu-SARSA framework outperformed other algorithms in terms of service time and the task failure rate, especially when the system was overloaded.
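The SARSA half of the Fu-SARSA scheme follows the standard on-policy temporal-difference update; the sketch below shows that textbook rule on a hypothetical offloading decision (local edge, neighbouring edge, cloud), not the paper's exact state or reward design.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
        r + gamma * Q.get((s_next, a_next), 0.0) - Q.get((s, a), 0.0))

# Hypothetical actions for an offloading decision.
ACTIONS = ["local_edge", "neighbour_edge", "cloud"]
Q = {}
# One illustrative transition: offloading to the neighbouring edge server
# succeeded quickly (reward 1.0); the next action chosen was "local_edge".
sarsa_update(Q, "overloaded", "neighbour_edge", 1.0, "normal", "local_edge")
```

Because SARSA bootstraps on the action actually chosen next (here by the fuzzy MEO), it learns the value of the policy being followed rather than of a greedy policy, which is why it pairs naturally with an external orchestrator.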
APA, Harvard, Vancouver, ISO, and other styles
47

Jegminat, Jannes, Simone Carlo Surace, and Jean-Pascal Pfister. "Learning as filtering: Implications for spike-based plasticity." PLOS Computational Biology 18, no. 2 (February 23, 2022): e1009721. http://dx.doi.org/10.1371/journal.pcbi.1009721.

Full text
Abstract:
Most normative models in computational neuroscience describe the task of learning as the optimisation of a cost function with respect to a set of parameters. However, learning as optimisation fails to account for a time-varying environment during the learning process and the resulting point estimate in parameter space does not account for uncertainty. Here, we frame learning as filtering, i.e., a principled method for including time and parameter uncertainty. We derive the filtering-based learning rule for a spiking neuronal network—the Synaptic Filter—and show its computational and biological relevance. For the computational relevance, we show that filtering improves the weight estimation performance compared to a gradient learning rule with optimal learning rate. The dynamics of the mean of the Synaptic Filter is consistent with spike-timing dependent plasticity (STDP) while the dynamics of the variance makes novel predictions regarding spike-timing dependent changes of EPSP variability. Moreover, the Synaptic Filter explains experimentally observed negative correlations between homo- and heterosynaptic plasticity.
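The "learning as filtering" contrast above can be illustrated with a scalar Kalman-style weight estimator that tracks both a mean and a variance, run alongside a plain fixed-rate gradient rule; this toy linear-Gaussian version is my simplification of the paper's spiking-network Synaptic Filter.

```python
import random

random.seed(0)
true_w, obs_sd = 2.0, 0.5

# Filtering: track mean AND variance of the weight estimate.
mean, var = 0.0, 1.0
# Gradient rule with a fixed learning rate, for comparison.
w_grad, lr = 0.0, 0.1

for _ in range(200):
    x = random.uniform(0.5, 1.5)                 # presynaptic input
    y = true_w * x + random.gauss(0.0, obs_sd)   # noisy postsynaptic target
    # Kalman-style update: the gain shrinks as uncertainty shrinks.
    gain = var * x / (var * x * x + obs_sd ** 2)
    mean += gain * (y - mean * x)
    var *= 1 - gain * x
    # Plain gradient step on squared error (learning rate never adapts).
    w_grad += lr * (y - w_grad * x) * x
```

The filter's effective learning rate (`gain`) decays automatically as the posterior variance collapses, whereas the gradient rule keeps injecting noise at a fixed rate; this is the weight-estimation advantage the abstract refers to.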
APA, Harvard, Vancouver, ISO, and other styles
48

Hrizi, Olfa, Karim Gasmi, Ibtihel Ben Ltaifa, Hamoud Alshammari, Hanen Karamti, Moez Krichen, Lassaad Ben Ammar, and Mahmood A. Mahmood. "Tuberculosis Disease Diagnosis Based on an Optimized Machine Learning Model." Journal of Healthcare Engineering 2022 (March 21, 2022): 1–13. http://dx.doi.org/10.1155/2022/8950243.

Full text
Abstract:
Computer science plays an important role in modern dynamic health systems. Given the collaborative nature of the diagnostic process, computer technology provides important services to healthcare professionals and organizations, as well as to patients and their families, researchers, and decision-makers. Thus, any innovations that improve the diagnostic process while maintaining quality and safety are crucial to the development of the healthcare field. Many diseases can be tentatively diagnosed during their initial stages. In this study, all developed techniques were applied to tuberculosis (TB). Thus, we propose an optimized machine learning-based model that extracts optimal texture features from TB-related images and selects the hyper-parameters of the classifiers. Increasing the accuracy rate and minimizing the number of characteristics extracted are our goals. In other words, this is a multitask optimization issue. A genetic algorithm (GA) is used to choose the best features, which are then fed into a support vector machine (SVM) classifier. Using the ImageCLEF 2020 data set, we conducted experiments using the proposed approach and achieved significantly higher accuracy and better outcomes in comparison with the state-of-the-art works. The obtained experimental results highlight the efficiency of the modified SVM classifier compared with other standard ones.
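The GA feature-selection step can be sketched with bitmask chromosomes, one-point crossover, and bit-flip mutation; the fitness function below is a synthetic stand-in (the paper scores subsets with an SVM on real image features), and all parameters are illustrative.

```python
import random

random.seed(42)
N_FEATURES = 8
USEFUL = {0, 2, 5}   # hypothetical: only these features improve accuracy

def fitness(mask):
    """Synthetic objective: reward useful features, penalise subset size
    (mimicking 'maximise accuracy, minimise feature count')."""
    hits = sum(1 for i in USEFUL if mask[i])
    return hits - 0.1 * sum(mask)

def evolve(pop_size=20, generations=40, p_mut=0.1):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_FEATURES)
            child = a[:cut] + b[cut:]            # one-point crossover
            child = [g ^ (random.random() < p_mut) for g in child]  # mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

In the paper's pipeline, `fitness` would train and validate the SVM on the feature subset encoded by `mask`; everything else about the GA loop stays the same.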
APA, Harvard, Vancouver, ISO, and other styles
49

Zhang, Huanan, and Stefanus Jasin. "Online Learning and Optimization of (Some) Cyclic Pricing Policies in the Presence of Patient Customers." Manufacturing & Service Operations Management 24, no. 2 (March 2022): 1165–82. http://dx.doi.org/10.1287/msom.2021.0979.

Full text
Abstract:
Problem definition: We consider the problem of joint learning and optimization of cyclic pricing policies in the presence of patient customers. In our problem, some customers are patient, and they are willing to wait in the system for several periods to make a purchase until the price is lower than their valuation. The seller does not know the joint distribution of customers’ valuation and patience level a priori and can only learn this from the realized total sales in every period. Academic/practical relevance: The revenue management problem with patient customers has been studied in the literature as an optimization problem, and cyclic policy has been shown to be optimal in some cases. We contribute to the literature by studying this problem from the joint learning and optimization perspective. Indeed, to the best of our knowledge, our paper is the first work that studies online learning and optimization for multiperiod pricing with patient customers. Methodology: We introduce new dynamic programming formulations for this problem, and we develop two nontrivial upper confidence bound–based learning algorithms. Results: We analyze both decreasing cyclic policies and so-called threshold-regulated policies, which contain both the decreasing cyclic policies and the nested decreasing cyclic policies. We show that our learning algorithms for these policies converge to the optimal clairvoyant decreasing cyclic policy and threshold-regulated policy at a near-optimal rate. Managerial implications: Our proposed algorithms perform significantly better than benchmark algorithms that either ignore the patient customer characteristic or simply use the standard estimate-then-optimize framework, which does not encourage enough exploration; this highlights the importance of “smart learning” in the context of data-driven decision making. 
In addition, our numerical results also show that combining our algorithms with smart estimation methods, such as linear interpolation or least square estimation, can significantly improve their empirical performance; this highlights the benefit of combining smart learning with smart estimation, which further increases the practical viability of the algorithms.
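The upper-confidence-bound machinery referenced above can be sketched with a textbook UCB1 selector over a discrete set of candidate prices; the paper's algorithms are substantially more involved (they learn cyclic policies for patient customers, not single prices), so this is only the underlying bandit primitive, with made-up demand.

```python
import math
import random

random.seed(1)
prices = [5.0, 7.5, 10.0]               # hypothetical candidate prices

def demand_prob(p):
    """Made-up purchase probability, decreasing in price."""
    return max(0.0, 1.0 - p / 12.0)

counts = [0] * len(prices)
revenue_sums = [0.0] * len(prices)

def ucb_pick(t):
    for i, c in enumerate(counts):
        if c == 0:
            return i                     # play each price once first
    return max(range(len(prices)),
               key=lambda i: revenue_sums[i] / counts[i]
               + math.sqrt(2 * math.log(t) / counts[i]))

for t in range(1, 3001):
    i = ucb_pick(t)
    sale = random.random() < demand_prob(prices[i])
    counts[i] += 1
    revenue_sums[i] += prices[i] * sale

most_played = max(range(len(prices)), key=lambda i: counts[i])
```

The confidence bonus forces "smart learning": the clearly dominated price (10.0, lowest expected revenue here) is sampled only logarithmically often, which is the exploration behavior the benchmark estimate-then-optimize approaches lack.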
APA, Harvard, Vancouver, ISO, and other styles
50

Zheng, Shaoxiong, Peng Gao, Weixing Wang, and Xiangjun Zou. "A Highly Accurate Forest Fire Prediction Model Based on an Improved Dynamic Convolutional Neural Network." Applied Sciences 12, no. 13 (July 2, 2022): 6721. http://dx.doi.org/10.3390/app12136721.

Full text
Abstract:
In this work, an improved dynamic convolutional neural network (DCNN) model to accurately identify the risk of a forest fire was established based on the traditional DCNN model. First, the DCNN network model was trained in combination with transfer learning, and multiple pre-trained DCNN models were used to extract features from forest fire images. Second, principal component analysis (PCA) reconstruction technology was used in the appropriate subspace. The constructed 15-layer forest fire risk identification DCNN model named “DCN_Fire” could accurately identify core fire insurance areas. Moreover, the original and enhanced image data sets were used to evaluate the impact of data enhancement on the model’s accuracy. The traditional DCNN model was improved and the recognition speed and accuracy were compared and analyzed with the other three DCNN model algorithms with different architectures. The difficulty of using DCNN to monitor forest fire risk was solved, and the model’s detection accuracy was further improved. The true positive rate was 7.41% and the false positive rate was 4.8%. When verifying the impact of different batch sizes and loss rates on verification accuracy, the loss rate of the DCN_Fire model of 0.5 and the batch size of 50 provided the optimal value for verification accuracy (0.983). The analysis results showed that the improved DCNN model had excellent recognition speed and accuracy and could accurately recognize and classify the risk of a forest fire under natural light conditions, thereby providing a technical reference for preventing and tackling forest fires.
APA, Harvard, Vancouver, ISO, and other styles
