PhD Position F/M PhD Thesis on RL-based Decision-Making and Planning for Automated Driving

Contract type: Fixed-term contract

Level of qualifications required: Graduate degree or equivalent

Function: PhD Position

Context

You will work within the ASTRA team (Automated and Safe TRAnsportation systems), a joint team of scientists from Inria and Valeo led by Fawzi Nashashibi (Inria) and Benazouz Bradai (Valeo).

This team designs models and algorithms for the development of intelligent transport system architectures. It is involved in several projects funded by the French National Research Agency, which aim to host Valeo employees and recruit young PhD students.

 

The PhD thesis focuses on decision-making and planning for automated driving using reinforcement learning (RL). It explores how autonomous vehicles make decisions (strategic, tactical, and operational) and plan their actions while accounting for safety and comfort constraints as well as interactions with other road users.

Decision-making systems must generate collision-free trajectories in dynamic environments while anticipating the movements of other road users. Despite recent advances, challenges persist, including improving motion prediction, ensuring the completeness of decision-making approaches, and enhancing system robustness to uncertainty in environmental data.

Reinforcement learning offers promising opportunities to enhance driving policies, trajectory planning, and decision-making processes. Recent studies have demonstrated its effectiveness, particularly in safe autonomous driving, multi-agent traffic management, and real-world deployment scenarios.

Assignment

The general objectives of the PhD thesis are to:

• Develop an RL-based decision-making and planning framework for automated driving systems. A natural starting point is the standard model for sequential decision-making: the Markov Decision Process (MDP). At first glance, this framework appeals through its simplicity and elegance, as well as its apparent generality and representational power. An (observable) state space, a (hierarchical) action space, (quasi-linear) system dynamics, and a (dense) reward function must be specified for a large class of behavioural planning tasks (a minimal, illustrative MDP sketch is given after this list).


• Optimize driving policies using RL algorithms to ensure safety, efficiency, and adaptability. The study in [8] explores several fundamental deep RL algorithms for improving automated driving performance, namely Proximal Policy Optimization (PPO), Deep Q-Network (DQN), and Deep Deterministic Policy Gradient (DDPG). The paper documents a comparative analysis of these three prominent algorithms based on their speed, accuracy, and overall performance. After a thorough evaluation, the research indicates that DQN outperformed the other algorithms (a training-and-comparison sketch is given after this list).

• Evaluate the performance of the proposed RL system through simulations and real-world testing.
This will be addressed through the following key points:
– Create a diverse set of driving scenarios representative of real-world conditions, including highway driving, urban environments, intersections, pedestrian crossings, adverse weather conditions, etc.
– Integrate the RL algorithm into a simulation/testing environment that includes realistic vehicle dynamics, sensor inputs, environmental factors, and interactions with other agents (e.g., vehicles, pedestrians).
– Define relevant performance metrics to evaluate the RL system's behavior. Metrics may include safety (e.g., collision rate), efficiency (e.g., average speed, fuel consumption), adherence to traffic rules, and comfort (e.g., smoothness of maneuvers); a metrics sketch is given after this list.
– Compare the performance of the RL-based system against baseline methods, such as rule-based controllers or handcrafted algorithms, to demonstrate its advantages.
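
As a concrete illustration of the MDP framing mentioned in the first objective, the sketch below casts a toy tactical lane-change task as a Markov Decision Process using the Gymnasium interface. The state variables, action set, dynamics, and reward terms are hypothetical placeholders chosen for readability, not the formulation that will be developed in the thesis.

```python
# Illustrative only: a toy Gymnasium-style MDP for tactical lane-change decisions.
# All state/action/reward choices here are hypothetical placeholders.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class ToyLaneChangeMDP(gym.Env):
    """Ego state: (lane index, longitudinal speed, gap to lead vehicle)."""

    ACTIONS = ["KEEP_LANE", "CHANGE_LEFT", "CHANGE_RIGHT", "ACCELERATE", "BRAKE"]

    def __init__(self, n_lanes: int = 3):
        self.n_lanes = n_lanes
        # Observable state space: lane in [0, n_lanes), speed in m/s, gap in m.
        self.observation_space = spaces.Box(
            low=np.array([0.0, 0.0, 0.0], dtype=np.float32),
            high=np.array([n_lanes - 1, 40.0, 200.0], dtype=np.float32),
            dtype=np.float32,
        )
        # Tactical action space reduced to a small discrete set.
        self.action_space = spaces.Discrete(len(self.ACTIONS))
        self.state = None

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.array([1.0, 25.0, 60.0], dtype=np.float32)
        return self.state, {}

    def step(self, action):
        lane, speed, gap = self.state
        name = self.ACTIONS[action]
        if name == "CHANGE_LEFT":
            lane = min(lane + 1, self.n_lanes - 1)
        elif name == "CHANGE_RIGHT":
            lane = max(lane - 1, 0.0)
        elif name == "ACCELERATE":
            speed = min(speed + 1.0, 40.0)
        elif name == "BRAKE":
            speed = max(speed - 2.0, 0.0)
        # Toy (quasi-linear) dynamics: the gap shrinks faster at higher speed.
        gap = max(gap - 0.1 * speed + self.np_random.normal(0.0, 1.0), 0.0)
        self.state = np.array([lane, speed, gap], dtype=np.float32)

        collided = gap <= 1.0
        lane_change = name in ("CHANGE_LEFT", "CHANGE_RIGHT")
        # Dense reward: progress minus comfort and safety penalties.
        reward = 0.1 * speed - 0.5 * lane_change - 100.0 * collided
        return self.state, float(reward), bool(collided), False, {}
```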
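In the same spirit, the comparison of deep RL algorithms in the second objective could start from off-the-shelf implementations. The sketch below trains DQN and PPO with Stable-Baselines3 on a stand-in Gymnasium task and compares their mean evaluation returns; the benchmark environment, time budget, and library choice are assumptions for illustration, and a driving simulator (e.g., highway-env or CARLA) would replace the stand-in task in practice.

```python
# Illustrative only: comparing DQN and PPO on a stand-in Gymnasium task with
# Stable-Baselines3. The thesis would use a driving simulation environment
# instead of CartPole; this benchmark is just a placeholder.
import gymnasium as gym
from stable_baselines3 import DQN, PPO
from stable_baselines3.common.evaluation import evaluate_policy


def train_and_score(algo_cls, env_id="CartPole-v1", steps=20_000):
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=steps)
    # Average undiscounted return over 20 evaluation episodes.
    mean_ret, std_ret = evaluate_policy(model, env, n_eval_episodes=20)
    env.close()
    return mean_ret, std_ret


if __name__ == "__main__":
    # DDPG is omitted here because it requires a continuous action space.
    for algo in (DQN, PPO):
        mean_ret, std_ret = train_and_score(algo)
        print(f"{algo.__name__}: {mean_ret:.1f} +/- {std_ret:.1f}")
```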
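Finally, for the evaluation objective, the sketch below computes a few of the metrics listed above (collision rate, average speed, and a comfort proxy based on longitudinal jerk) from logged episode traces. The trace format is a hypothetical placeholder; actual logs would come from the simulation or test-vehicle pipeline.

```python
# Illustrative only: computing collision rate, average speed, and a jerk-based
# comfort metric from logged episode traces (hypothetical trace format).
from dataclasses import dataclass
import numpy as np


@dataclass
class EpisodeTrace:
    speeds: np.ndarray         # longitudinal speed [m/s], sampled every dt
    accelerations: np.ndarray  # longitudinal acceleration [m/s^2], sampled every dt
    collided: bool             # did the episode end in a collision?
    dt: float = 0.1            # sampling period [s]


def collision_rate(traces):
    return sum(t.collided for t in traces) / len(traces)


def average_speed(traces):
    return float(np.mean(np.concatenate([t.speeds for t in traces])))


def mean_abs_jerk(traces):
    # Jerk is the time derivative of acceleration; lower means smoother maneuvers.
    jerks = [np.abs(np.diff(t.accelerations) / t.dt) for t in traces]
    return float(np.mean(np.concatenate(jerks)))


# Usage with two fabricated traces standing in for simulation roll-outs.
traces = [
    EpisodeTrace(np.array([20.0, 21.0, 22.0]), np.array([0.5, 0.5, 0.4]), False),
    EpisodeTrace(np.array([25.0, 24.0, 10.0]), np.array([0.0, -3.0, -6.0]), True),
]
print(collision_rate(traces), average_speed(traces), mean_abs_jerk(traces))
```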


In conclusion, the proposed research project holds strong potential for advancing RL applications in automated driving systems by developing a sophisticated decision-making framework that prioritizes safety, efficiency, and adaptability. Through rigorous evaluation and testing, this project aims to contribute valuable insights to the field of autonomous vehicle technology, ultimately leading to safer, more efficient, and more intelligent autonomous driving solutions.

 

References

[1] Laurène Claussmann, Marc Revilloud, Dominique Gruyer, and Sébastien Glaser. A review of motion planning for highway autonomous driving. IEEE Transactions on Intelligent Transportation Systems, 21(5):1826–1848, 2020.

[2] Fernando Garrido and Paulo Resende. Review of decision-making and planning approaches in automated driving. IEEE Access, 10:100348–100366, 2022.

[3] D. González, J. Pérez, V. Milanés, and F. Nashashibi. A review of motion planning techniques for automated vehicles. IEEE Transactions on Intelligent Transportation Systems, 17(4):1135–1145, April 2016.

[4] B. Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, and Patrick Pérez. Deep reinforcement learning for autonomous driving: A survey. IEEE Transactions on Intelligent Transportation Systems, 23(6):4909–4926, 2022.

[5] Hanna Krasowski, Yinqiang Zhang, and Matthias Althoff. Safe reinforcement learning for urban driving using invariably safe braking sets. In 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), pages 2407–2414, 2022.

[6] Shahrokh Paravarzar and Belqes Mohammad. Motion prediction on self-driving cars: A review, 2020.

[7] Stefano Pini, Christian S. Perone, Aayush Ahuja, Ana Sofia Rufino Ferreira, Moritz Niendorf, and Sergey Zagoruyko. Safe real-world autonomous driving by learning to predict and plan with a mixture of experts. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 10069–10075, 2023.

[8] Akshaj Tammewar, Nikita Chaudhari, Bunny Saini, Divya Venkatesh, Ganpathiraju Dharahas, Deepali Vora, Shruti Patil, Ketan Kotecha, and Sultan Alfarhood. Improving the performance of autonomous driving through deep reinforcement learning. Sustainability, 15(18), 2023.

[9] Shijie Wang and Shangbo Wang. A novel multi-agent deep RL approach for traffic signal control, June 2023.

Main activities

  • conduct research
  • write scientific papers
  • present work at scientific conferences
  • develop software
  • manage and curate data
  • interact with partners (scientists, engineers)
  • write a doctoral thesis
  • participate in demonstrations and showcases

Skills

Candidate Profile:

  • Master's degree in a relevant field
  • Strong background in machine learning, particularly reinforcement learning
  • Proficiency in programming languages such as Python and C++
  • Solid understanding of robotic systems and of the control challenges posed by real-time performance and real-world environments
  • Familiarity with simulation environments for automated vehicles
  • Experience in developing control algorithms or motion planning strategies for autonomous vehicles
  • Excellent problem-solving skills and the ability to work both independently and collaboratively in an interdisciplinary team.

Languages:

  • French is optional but very desirable.
  • Good level of English for communication (international team).

Additional skills:

  • Strong ability to work in groups
  • Autonomy (essential)
  • Motivation
  • Initiative

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours (after 6 months of employment)
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage