Southwest Jiaotong University Faculty Homepage xinghuanlai--Home-- A DRL Agent for Jointly Optimizing Computation Offloading and Resource Allocation in MEC

xinghuanlai Associate Professor

Supervisor of Doctorate Candidates

Supervisor of Master's Candidates

Education Level: PhD graduate

Professional Title: Associate Professor

Alma Mater: University of Nottingham, UK

School/Department: School of Computing and Artificial Intelligence

Discipline:Communications and Information Systems; Computer Science and Technology

Paper Publications

A DRL Agent for Jointly Optimizing Computation Offloading and Resource Allocation in MEC

DOI number:10.1109/JIOT.2021.3081694

Affiliation of Author(s):Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence

Journal:IEEE Internet of Things Journal

Key Words:Task analysis,Resource management,Optimization,Training,Energy consumption,Computational modeling,Servers,Computation offloading,deep deterministic policy gradient (DDPG),deep reinforcement learning (DRL),mobile-edge computing (MEC),resource allocation

Abstract:This article studies the joint optimization problem of computation offloading and resource allocation (JCORA) in mobile-edge computing (MEC). Deep reinforcement learning (DRL) is one of the ideal techniques for addressing the dynamic JCORA problem. However, it is still challenging to adapt traditional DRL methods for the problem since they usually lead to slow and unstable convergence in model training. To this end, we propose a temporal attentional deterministic policy gradient (TADPG) to tackle JCORA. Based on the deep deterministic policy gradient (DDPG), TADPG has two significant features. First, a temporal feature extraction network consisting of a 1-D convolution (Conv1D) residual block and an attentional long short-term memory (LSTM) network is designed, which is beneficial to high-quality state representation and function approximation. Second, a rank-based prioritized experience replay (rPER) method is devised to accelerate and stabilize the convergence of model training. Experimental results demonstrate that the decentralized TADPG-based mechanism can achieve more efficient JCORA performance than the centralized one, and the proposed TADPG outperforms a number of state-of-the-art DRL agents in terms of the task completion time and energy consumption.
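As an illustration of the rank-based prioritized experience replay idea mentioned in the abstract, the following is a minimal sketch, assuming the commonly used formulation in which a transition is sampled with probability proportional to (1 / rank)^alpha, where rank 1 belongs to the transition with the largest absolute TD error. The function names, the `alpha` default, and the list-based buffer are hypothetical choices for this example; the rPER variant in the paper may differ in its details.

```python
import random


def rank_based_probabilities(td_errors, alpha=0.7):
    """Return P(i) proportional to (1 / rank(i)) ** alpha.

    Assumption: rank 1 is the transition with the largest |TD error|.
    """
    # Indices sorted from largest to smallest absolute TD error.
    order = sorted(range(len(td_errors)),
                   key=lambda i: abs(td_errors[i]),
                   reverse=True)
    weights = [0.0] * len(td_errors)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = (1.0 / rank) ** alpha
    total = sum(weights)
    return [w / total for w in weights]


def sample_batch(buffer, td_errors, batch_size, alpha=0.7, rng=random):
    """Draw a mini-batch (with replacement) using rank-based priorities."""
    probs = rank_based_probabilities(td_errors, alpha)
    return rng.choices(buffer, weights=probs, k=batch_size)


# Hypothetical usage: three stored transitions with their TD errors.
transitions = ["t0", "t1", "t2"]
errors = [0.1, -2.0, 0.5]
batch = sample_batch(transitions, errors, batch_size=4, rng=random.Random(0))
```

Because the probabilities depend only on the ordering of the TD errors and not on their magnitudes, a single outlier error cannot dominate sampling, which is one reason rank-based prioritization is often considered more robust than proportional prioritization.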

Co-author:Juan Chen,Huanlai Xing*,Zhiwen Xiao,Lexi Xu*,Tao Tao

Document Code:10.1109/JIOT.2021.3081694

Volume:8

Issue:24

Page Number:17508-17524

ISSN No.:2327-4662

Translation or Not:no

Date of Publication:2022-02-19

Included Journals:SCI
