A Reinforcement Learning-based Bi-Objective Routing Algorithm for Energy Harvesting Mobile Ad-hoc Networks
Paper ID : 1316-IST
1Vesal Hakami *, 1Meisam Maleki, 2Mehdi Dehghan
1Amirkabir University of Technology
2Amirkabir University of Technology
Dynamic topology, the lack of a fixed infrastructure, and limited energy in mobile ad-hoc networks (MANETs) give rise to a challenging operational environment. Under such circumstances, MANET routing protocols should track dynamic network changes (e.g., in link quality and nodes’ residual energy) and adapt to them in order to handle traffic flows efficiently. In this paper, we present a bi-objective intelligent routing protocol that aims to reduce an expected long-run cost function composed of the end-to-end delay and the path energy cost. We formulate the routing problem as a Markov decision process (MDP) that captures both the link state dynamics due to node mobility and the energy state dynamics due to the nodes’ rechargeable energy sources. We propose a reinforcement learning-based algorithm to approximate the optimal routing policy in the absence of a priori knowledge of the system statistics. We compare the performance of the proposed scheme with that of a value-iteration-based algorithm that assumes perfect knowledge of the system statistics.
MANET; routing; reinforcement learning; MDP; end-to-end delay; network lifetime
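To make the idea in the abstract concrete, the following is a minimal sketch of model-free learning of a next-hop policy under a weighted delay/energy cost. It is not the algorithm of the paper: it assumes plain tabular Q-learning, a hypothetical four-node topology (LINKS), a scalar trade-off parameter (weight), and illustrative hyperparameters, and it uses only the current node as the state, whereas the MDP described above also tracks link and energy states.

```python
import random

# Hypothetical topology (an assumption for illustration):
# node -> {next_hop: (mean_delay, energy_cost)}
LINKS = {
    0: {1: (1.0, 4.0), 2: (3.0, 1.0)},  # fast but energy-hungry vs. slow but cheap
    1: {3: (1.0, 4.0)},
    2: {3: (3.0, 1.0)},
}
DEST = 3


def step_cost(node, hop, weight, rng):
    """Weighted sum of a noisy delay sample and the energy cost of one hop."""
    mean_delay, energy = LINKS[node][hop]
    delay = mean_delay * rng.uniform(0.5, 1.5)  # the learner never sees the mean
    return weight * delay + (1.0 - weight) * energy


def q_learning(weight, episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    """Learn cost-to-go estimates Q[node][next_hop] from sampled costs only."""
    rng = random.Random(seed)
    q = {node: {hop: 0.0 for hop in hops} for node, hops in LINKS.items()}
    for _ in range(episodes):
        node = 0  # every episode routes one packet from node 0 to DEST
        while node != DEST:
            hops = list(q[node])
            if rng.random() < epsilon:
                hop = rng.choice(hops)            # explore
            else:
                hop = min(hops, key=q[node].get)  # exploit: lowest estimated cost
            cost = step_cost(node, hop, weight, rng)
            future = 0.0 if hop == DEST else min(q[hop].values())
            # Q-learning update for cost minimization
            q[node][hop] += alpha * (cost + gamma * future - q[node][hop])
            node = hop
    return q


def greedy_path(q, source=0):
    """Follow the lowest-cost next hop from source until DEST is reached."""
    path = [source]
    while path[-1] != DEST:
        path.append(min(q[path[-1]], key=q[path[-1]].get))
    return path


if __name__ == "__main__":
    print(greedy_path(q_learning(weight=1.0)))  # delay-only objective
    print(greedy_path(q_learning(weight=0.0)))  # energy-only objective
```

In this toy setting, moving weight between 0 and 1 shifts the learned route between the energy-cheap and the low-delay path, which is the kind of trade-off a bi-objective cost is meant to expose. A value-iteration baseline would instead compute the same cost-to-go values directly from known delay and energy statistics.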