Wang Xiao, master's degree supervisor, received his B.S. from Beijing Institute of Technology in 2015 and his Ph.D. from Beihang University in 2021. His research covers large models, embodied intelligence, flight vehicle dynamics and control, and autonomous mission planning for spacecraft clusters. To date, he has published 18 academic papers, including 10 SCI-indexed papers as first author in journals such as IEEE Transactions on Aerospace and Electronic Systems, Neurocomputing, and Advances in Space Research, plus 5 EI-indexed and Chinese core-journal papers. He has filed 8 invention patent applications, 5 of which have been granted. He has led a number of vertical (provincial- and ministerial-level) and horizontal research projects, including "Satellite Cluster Computing Power Aggregation and Mission Planning Software Development", "Research on Spacecraft Dynamic Game and Confrontation Technology Based on Multi-Agent Reinforcement Learning", "Development of an Autonomous Mission Planning Module for Cislunar Space", and "Research on Spatio-Temporal Causal Reasoning and Decision-Making Methods for Dynamic Uncertain Space Targets".
Courses Taught
Principles of Automatic Control | Undergraduate
Intelligent Control Theory and Applications | Graduate
Academic Publications
[1] Wang X, Li D. Bioinspired Actor-Critic Algorithm for Reinforcement Learning Interpretation with Levy-Brown Hybrid Exploration Strategy[J]. Neurocomputing, 2024, 574.
[2] Wang X, Ma Z, Cao L, et al. A planar tracking strategy based on multiple-interpretable improved PPO algorithm with few-shot technique[J]. Scientific Reports, 2024, 14: 3910. https://doi.org/10.1038/s41598-024-54268-6.
[3] Wang X, Yang Z, Han Y, Li H, Shi P. Method of sequential intention inference for a space target based on meta-fuzzy decision tree[J]. Advances in Space Research, 2024. https://doi.org/10.1016/j.asr.2024.06.049.
[4] Wang X, Li J, Cao L, Ran D, Ji M, Sun K, Han Y, Ma Z. A data-knowledge joint-driven reinforcement learning algorithm based on guided policy and state-prediction for satellite continuous-thrust tracking[J]. Advances in Space Research, 2024. https://doi.org/10.1016/j.asr.2024.06.070.
[5] Wang X, Shi P, Wen C, Zhao Y. Design of Parameter-Self-Tuning Controller Based on Reinforcement Learning for Tracking Noncooperative Targets in Space[J]. IEEE Transactions on Aerospace and Electronic Systems, 2020, 56(6): 4192-4208. doi: 10.1109/TAES.2020.2988170.
[6] Wang X, Shi P, Zhao Y, et al. A Pre-Trained Fuzzy Reinforcement Learning Method for the Pursuing Satellite in a One-to-One Game in Space[J]. Sensors, 2020, 20(8): 2253.
[7] Wang X, Shi P, Wen C, et al. An Algorithm of Reinforcement Learning for Maneuvering Parameter Self-Tuning Applying in Satellite Cluster[J]. Mathematical Problems in Engineering, 2020, 2020(5): 1-17.
[8] Wang X, Shi P, Schwartz H, et al. An algorithm of pretrained fuzzy actor–critic learning applying in fixed-time space differential game[J]. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 2021, 235(14): 2095-2112.
[9] Wang X, Ma Z, Mao L, Sun K, Huang X, Fan C, Li J. Accelerating Fuzzy Actor–Critic Learning via Suboptimal Knowledge for a Multi-Agent Tracking Problem[J]. Electronics, 2023, 12(8): 1852.
[10] Wang X, Yang Z, Bai X, Ji M, Li H, Ran D. A Consistent Round-Up Strategy Based on PPO Path Optimization for the Leader–Follower Tracking Problem[J]. Sensors, 2023, 23: 8814.
[11] Li D, Zhu F, Wang X, Jin Q. Multi-objective reinforcement learning for fed-batch fermentation process control[J]. Journal of Process Control, 2022, 115(11): 89-99.
[12] Song L, Li D, Wang X, Xu X. AdaBoost Maximum Entropy Deep Inverse Reinforcement Learning with Truncated Gradient[J]. Information Sciences, 2022, 602(2).
[13] Wang X, Han Y, Tang M, Zhang F. Robust Orbital Game Policy in Multiple Disturbed Environments: An Approach Based on Causality Diversity Maximal Marginal Relevance Algorithm[C]. In: Liu L, Niu Y, Fu W, Qu Y (eds). Proceedings of the 4th 2024 International Conference on Autonomous Unmanned Systems (ICAUS 2024). Lecture Notes in Electrical Engineering, vol 1374. Springer, Singapore, 2025.
[14] Wang X, Wang L, Zhang C, Zhang H, Li D. Robust orbital game policy in multiple disturbed environments: an approach based on causality diversity maximal marginal relevance algorithm[C]. C2-CHINA 2025 (accepted).
[15] Wang X, Wen C, Zhao Y, Shi P. Collision possibility determination method within the safety corridor of a tumbling target[J]. Journal of Harbin Institute of Technology, 2018, 50(4): 94-101+187. (in Chinese)
[16] Wang X, Shi P, Wen C, Zhao Y. Adaptive joint position and attitude control for close-range berthing with a non-cooperative target[C]. Proceedings of the 36th Chinese Control Conference. Technical Committee on Control Theory, Chinese Association of Automation, 2017: 129-134. (in Chinese)
[17] Wang X, Shi P, Wen C, Zhao Y. Adaptive position and attitude tracking control for feature points of a failed satellite[J]. Chinese Space Science and Technology, 2018, 38(1): 8-17. (in Chinese)