A dynamic routing algorithm based on deep reinforcement learning

Information and Communications Technology and Policy, 2020, Vol. 46, Issue (9): 48-54.

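School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Online: 2020-09-15 Published: 2020-11-05
About the authors:
XIAO Yang: PhD candidate at the Intelligent Perception and Computing Teaching and Research Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications. Research focus: autonomous networks based on reinforcement learning.
WU Jiawei: master's student at the Intelligent Perception and Computing Teaching and Research Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications. Research focus: network routing algorithms based on reinforcement learning.
LI Jianxue: master's student at the Intelligent Perception and Computing Teaching and Research Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications. Research focus: network routing algorithms based on reinforcement learning.
LIU Jun: associate professor and doctoral supervisor at the Intelligent Perception and Computing Teaching and Research Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications; director of the university's Data Science Center; executive council member of the Beijing Big Data Association. Research focus: autonomous routing based on reinforcement learning.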

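Abstract

Routing underpins the stable operation of network infrastructure and is a key function supporting the continued development of next-generation networks. Today, the rapid growth of network traffic and constantly changing service requirements pose severe challenges to traditional routing algorithms. In recent years, deep reinforcement learning has shown good results on complex continuous control problems. To address the shortcomings of traditional routing algorithms, we combine the Deep Deterministic Policy Gradient (DDPG) algorithm with the routing scenario and propose a new dynamic routing algorithm based on deep reinforcement learning, DDPG4Net. We then verify the algorithm's effectiveness on RL4Net, a network simulator we developed ourselves.

Key words: deep reinforcement learning, routing algorithm, network traffic engineering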

XIAO Yang, WU Jiawei, LI Jianxue, LIU Jun. A dynamic routing algorithm based on deep reinforcement learning[J]. Information and Communications Technology and Policy, 2020, 46(9): 48-54.

Link to this article: http://ictp.caict.ac.cn/CN/Y2020/V46/I9/48
