Wang Xiao, master's degree supervisor, received his B.S. from Beijing Institute of Technology in 2015 and his Ph.D. from Beihang University in 2021. His research covers large models, embodied intelligence, flight vehicle dynamics and control, and autonomous mission planning for spacecraft clusters. To date, he has published 18 academic papers, including 10 SCI-indexed papers as first author in journals such as IEEE Transactions on Aerospace and Electronic Systems, Neurocomputing, and Advances in Space Research, plus 5 EI-indexed and Chinese core-journal papers. He has filed 8 invention patent applications, 5 of which have been granted. He has led a number of vertical (provincial- and ministerial-level) and horizontal research projects, including "Satellite Cluster Computing Power Aggregation and Mission Planning Software Development", "Research on Spacecraft Dynamic Game and Confrontation Technology Based on Multi-Agent Reinforcement Learning", "Development of an Autonomous Mission Planning Module for Cislunar Space", and "Research on Spatio-Temporal Causal Reasoning and Decision-Making Methods for Dynamic Uncertain Space Targets".
Courses Taught
Principles of Automatic Control | Undergraduate
Intelligent Control Theory and Applications | Graduate
Academic Publications
[1] Wang X, Li D. Bioinspired Actor-Critic Algorithm for Reinforcement Learning Interpretation with Levy-Brown Hybrid Exploration Strategy[J]. Neurocomputing, 2024, 574.
[2] Wang X, Ma Z, Cao L, et al. A planar tracking strategy based on multiple-interpretable improved PPO algorithm with few-shot technique[J]. Scientific Reports, 2024, 14: 3910. https://doi.org/10.1038/s41598-024-54268-6.
[3] Wang X, Yang Z, Han Y, Li H, Shi P. Method of sequential intention inference for a space target based on meta-fuzzy decision tree[J]. Advances in Space Research, 2024. https://doi.org/10.1016/j.asr.2024.06.049.
[4] Wang X, Li J, Cao L, Ran D, Ji M, Sun K, Han Y, Ma Z. A data-knowledge joint-driven reinforcement learning algorithm based on guided policy and state-prediction for satellite continuous-thrust tracking[J]. Advances in Space Research, 2024. https://doi.org/10.1016/j.asr.2024.06.070.
[5] Wang X, Shi P, Wen C, Zhao Y. Design of Parameter-Self-Tuning Controller Based on Reinforcement Learning for Tracking Noncooperative Targets in Space[J]. IEEE Transactions on Aerospace and Electronic Systems, 2020, 56(6): 4192-4208. doi: 10.1109/TAES.2020.2988170.
[6] Wang X, Shi P, Zhao Y, et al. A Pre-Trained Fuzzy Reinforcement Learning Method for the Pursuing Satellite in a One-to-One Game in Space[J]. Sensors, 2020, 20(8): 2253.
[7] Wang X, Shi P, Wen C, et al. An Algorithm of Reinforcement Learning for Maneuvering Parameter Self-Tuning Applying in Satellite Cluster[J]. Mathematical Problems in Engineering, 2020, 2020(5): 1-17.
[8] Wang X, Shi P, Schwartz H, et al. An algorithm of pretrained fuzzy actor–critic learning applying in fixed-time space differential game[J]. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 2021, 235(14): 2095-2112.
[9] Wang X, Ma Z, Mao L, Sun K, Huang X, Fan C, Li J. Accelerating Fuzzy Actor–Critic Learning via Suboptimal Knowledge for a Multi-Agent Tracking Problem[J]. Electronics, 2023, 12(8): 1852.
[10] Wang X, Yang Z, Bai X, Ji M, Li H, Ran D. A Consistent Round-Up Strategy Based on PPO Path Optimization for the Leader–Follower Tracking Problem[J]. Sensors, 2023, 23: 8814.
[11] Li D, Zhu F, Wang X, Jin Q. Multi-objective reinforcement learning for fed-batch fermentation process control[J]. Journal of Process Control, 2022, 115(11): 89-99.
[12] Song L, Li D, Wang X, Xu X. AdaBoost Maximum Entropy Deep Inverse Reinforcement Learning with Truncated Gradient[J]. Information Sciences, 2022, 602(2).
[13] Wang X, Han Y, Tang M, Zhang F. Robust Orbital Game Policy in Multiple Disturbed Environments: An Approach Based on Causality Diversity Maximal Marginal Relevance Algorithm[C]. In: Liu L, Niu Y, Fu W, Qu Y (eds). Proceedings of the 4th 2024 International Conference on Autonomous Unmanned Systems (ICAUS 2024). Lecture Notes in Electrical Engineering, vol 1374. Springer, Singapore, 2025.
[14] Wang X, Wang L, Zhang C, Zhang H, Li D. Robust orbital game policy in multiple disturbed environments: an approach based on causality diversity maximal marginal relevance algorithm[C]. C2-CHINA 2025 (accepted).
[15] Wang X, Wen C, Zhao Y, Shi P. Collision possibility determination method within the safety corridor of a tumbling target[J]. Journal of Harbin Institute of Technology, 2018, 50(4): 94-101+187. (in Chinese)
[16] Wang X, Shi P, Wen C, Zhao Y. Adaptive joint position and attitude control for close-range berthing with a non-cooperative target[C]. Proceedings of the 36th Chinese Control Conference. Technical Committee on Control Theory, Chinese Association of Automation, 2017: 129-134. (in Chinese)
[17] Wang X, Shi P, Wen C, Zhao Y. Adaptive position and attitude tracking control for feature points of a failed satellite[J]. Chinese Space Science and Technology, 2018, 38(1): 8-17. (in Chinese)