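Wang Xiao (王逍), master's supervisor, received a bachelor's degree from Beijing Institute of Technology in 2015 and a doctorate from Beihang University in 2021. Research interests include large models, embodied intelligence, flight vehicle dynamics and control, and autonomous mission planning for spacecraft clusters. To date, Wang has published 18 academic papers in these areas, including, as first author, 10 SCI-indexed papers in journals such as IEEE Transactions on Aerospace and Electronic Systems, Neurocomputing, and Advances in Space Research, and 5 EI-indexed and core-journal papers. Wang has filed 8 invention patent applications, 5 of which have been granted. Wang has led provincial- and ministerial-level government-funded projects as well as other government- and industry-funded projects, including "Software Development for Constellation Computing-Power Aggregation and Mission Planning", "Research on Spacecraft Dynamic Game Confrontation Technology Based on Multi-Agent Reinforcement Learning", "Development of an Autonomous Mission Planning Module for Cislunar Space", and "Research on Spatiotemporal Causal Inference and Decision-Making Methods for Dynamic, Uncertain Space Targets".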
Courses Taught
| Principles of Automatic Control | Undergraduate |
| Intelligent Control Theory and Applications | Graduate |
Academic Papers
[1] Wang X, Li D. Bioinspired Actor-Critic Algorithm for Reinforcement Learning Interpretation with Levy-Brown Hybrid Exploration Strategy. Neurocomputing, 2024, 574(Mar. 14): 1.1-1.16.
[2] Wang, X., Ma, Z., Cao, L. et al. A planar tracking strategy based on multiple-interpretable improved PPO algorithm with few-shot technique. Sci Rep, 2024, 14: 3910. https://doi.org/10.1038/s41598-024-54268-6.
[3] Xiao Wang, Zhuo Yang, Yuying Han, Hao Li, Peng Shi. Method of sequential intention inference for a space target based on meta-fuzzy decision tree. Advances in Space Research, 2024, ISSN 0273-1177, https://doi.org/10.1016/j.asr.2024.06.049.
[4] Xiao Wang, Jiake Li, Lu Cao, Dechao Ran, Mingjiang Ji, Kewu Sun, Yuying Han, Zhe Ma. A data-knowledge joint-driven reinforcement learning algorithm based on guided policy and state-prediction for satellite continuous-thrust tracking. Advances in Space Research, 2024, ISSN 0273-1177, https://doi.org/10.1016/j.asr.2024.06.070.
[5] Wang X., Shi P., Wen C. and Zhao Y. Design of Parameter-Self-Tuning Controller Based on Reinforcement Learning for Tracking Noncooperative Targets in Space[J]. IEEE Transactions on Aerospace and Electronic Systems, 56(6): 4192-4208, Dec. 2020, doi: 10.1109/TAES.2020.2988170.
[6] Wang X, Shi P, Zhao Y, et al. A Pre-Trained Fuzzy Reinforcement Learning Method for the Pursuing Satellite in a One-to-One Game in Space[J]. Sensors, 2020, 20(8): 2253.
[7] Wang X, Shi P, Wen C, et al. An Algorithm of Reinforcement Learning for Maneuvering Parameter Self-Tuning Applying in Satellite Cluster[J]. Mathematical Problems in Engineering, 2020, 2020(5): 1-17.
[8] Wang X, Shi P, Schwartz H, et al. An algorithm of pretrained fuzzy actor–critic learning applying in fixed-time space differential game[J]. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 2021, 235(14): 2095-2112.
[9] Wang X, Ma Z, Mao L, Sun K, Huang X, Fan C, Li J. Accelerating Fuzzy Actor–Critic Learning via Suboptimal Knowledge for a Multi-Agent Tracking Problem. Electronics, 2023, 12(8): 1852.
[10] Wang X, Yang Z, Bai X, Ji M, Li H, Ran D. A Consistent Round-Up Strategy Based on PPO Path Optimization for the Leader–Follower Tracking Problem. Sensors, 2023, 23: 8814.
[11] Li D, Zhu F, Wang X, Jin Q. Multi-objective reinforcement learning for fed-batch fermentation process control[J]. Journal of Process Control, 2022, 115(11): 89-99.
[12] Song L, Li D, Wang X, Xu X. AdaBoost Maximum Entropy Deep Inverse Reinforcement Learning with Truncated Gradient[J]. Information Sciences, 2022, 602(2).
[13] Wang, X., Han, Y., Tang, M., Zhang, F. (2025). Robust Orbital Game Policy in Multiple Disturbed Environments: An Approach Based on Causality Diversity Maximal Marginal Relevance Algorithm. In: Liu, L., Niu, Y., Fu, W., Qu, Y. (eds) Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems (4th ICAUS 2024). ICAUS 2024. Lecture Notes in Electrical Engineering, vol 1374. Springer, Singapore.
[14] Xiao Wang, Lishuo Wang, Chen Zhang, Hao Zhang, Dazi Li. Robust orbital game policy in multiple disturbed environments: an approach based on causality diversity maximal marginal relevance algorithm. C2-CHINA 2025 (accepted).
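[15] Wang Xiao, Wen Changxuan, Zhao Yushan, Shi Peng. A method for judging collision possibility within the safety corridor of a tumbling target[J]. Journal of Harbin Institute of Technology, 2018, 50(04): 94-101+187. (in Chinese)
[16] Wang Xiao, Shi Peng, Wen Changxuan, Zhao Yushan. Adaptive combined position and attitude control for close-range berthing with a noncooperative target[C]. Proceedings of the 36th Chinese Control Conference (C). Technical Committee on Control Theory, Chinese Association of Automation, 2017: 129-134. (in Chinese)
[17] Wang Xiao, Shi Peng, Wen Changxuan, Zhao Yushan. Adaptive position and attitude tracking control for feature points of a failed satellite[J]. Chinese Space Science and Technology, 2018, 38(01): 8-17. (in Chinese)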

