Perception and Control of Connected Autonomous Vehicles

Published: 2024-03-25

Time: Thursday, March 28, 2024, 14:00-15:00

Venue: Smart Emergency Management Laboratory, 2nd floor, beat365中国官方网站

Topic: Perception and Control of Connected Autonomous Vehicles

Speaker: Yunyi Liang (梁韵逸), beat365中国官方网站

Bio: Yunyi Liang is a Lecturer in the Department of Transportation System Engineering. He received his PhD from Tongji University, China, and was a visiting PhD student at the University of Washington, United States, during his doctoral studies. He then worked as a postdoctoral researcher at Tongji University and at the Technical University of Munich, Germany. His research interests include sensing, optimization, and control of transportation systems with connected autonomous vehicles, as well as reinforcement learning and deep learning. He has led projects funded by the Marie Skłodowska-Curie fellowship (European Union), the Technical University of Munich Global Postdoc Fellowship, the National Natural Science Foundation of China (NSFC) Young Scientists Fund, and the Shanghai Super Postdoc Fellowship. As first or corresponding author, he has published more than 10 SCI-indexed journal papers, several of them in leading journals such as IEEE Transactions on Intelligent Transportation Systems, and more than 10 papers at the Transportation Research Board (TRB) Annual Meeting.

Abstract: This talk presents the speaker's research progress in perception and control of connected autonomous vehicles (CAVs), covering three topics.

(1) A LiDAR point cloud snow-noise removal method based on federated meta unsupervised learning. A multi-stage wavelet convolutional neural network is constructed to remove snow noise from LiDAR point cloud data in an unsupervised manner. Because LiDAR point cloud data with snow noise is scarce, a federated learning algorithm is proposed to share training data among CAVs while protecting data privacy. To overcome the low accuracy of federated learning caused by heterogeneous data distributions, a meta-learning algorithm is developed to update the weights of each CAV's multi-stage wavelet convolutional neural network.

(2) A CAV merging control method based on federated meta multi-agent reinforcement learning. The CAV merging control problem is formulated as a multi-agent reinforcement learning model. Each CAV is an agent whose reward function is a weighted sum of speed, the negative of the collision rate, time headway, and waiting time in the acceleration lane. Because training this model requires a large number of samples, a federated learning algorithm is designed to share training data among the agents while protecting data privacy. To address the low accuracy of federated learning caused by heterogeneous data distributions, a meta-learning algorithm is constructed to train the model of each CAV.

(3) A joint optimization method for CAV lane-changing decisions and trajectories based on hierarchical reinforcement learning. To optimize the lane-changing decision and the trajectory jointly, the problem is formulated as a hierarchical reinforcement learning model. The decision layer is a DDQN reinforcement learning model whose reward function is a weighted sum of safety, efficiency, comfort, and a lane-change penalty. The trajectory layer is a CARLA simulation model. An asynchronous temporal-difference algorithm is designed to train the decision-layer and trajectory-layer models alternately.
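
As a rough illustration of the federated learning step in items (1) and (2), the sketch below shows sample-size-weighted parameter averaging in the style of FedAvg. The abstract does not say which aggregation rule the speaker uses, so FedAvg itself, the function name `federated_average`, the parameter name `conv1`, and the toy numbers are all assumptions made for illustration. In this sketch only model parameters are exchanged; each vehicle's raw data stays local.

```python
from typing import Dict, List

# Parameter name -> flat list of values (a stand-in for real network tensors).
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical example: three vehicles, the third holding twice as much local data.
vehicle_weights = [
    {"conv1": [1.0, 2.0]},
    {"conv1": [3.0, 4.0]},
    {"conv1": [5.0, 6.0]},
]
vehicle_sizes = [1, 1, 2]
global_weights = federated_average(vehicle_weights, vehicle_sizes)
print(global_weights)
```

With heterogeneous data, a single averaged model may fit no vehicle well, which is the gap the meta-learning step described in the abstract is meant to close by adapting the model for each CAV.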