2022-05117 - Post-Doctoral Research Visit F/M Reinforcement Learning with Applications to Active Learning and Recommender Systems

Contract type : Fixed-term contract

Renewable contract : Yes

Level of qualifications required : PhD or equivalent

Function : Post-Doctoral Research Visit

About the research centre or Inria department

The Inria Sophia Antipolis - Méditerranée center has 34 research teams and 7 support departments. The center's staff (about 500 people, including 320 Inria employees) is made up of scientists of different nationalities (250 foreigners of 50 nationalities), engineers, technicians and administrative staff. One third of the staff are civil servants; the others are contract staff. The majority of the center’s research teams are located in Sophia Antipolis and Nice in the Alpes-Maritimes. Four teams are based in Montpellier, and two teams are hosted in Bologna (Italy) and Athens (Greece). The center is a founding member of Université Côte d'Azur and a partner of the I-site MUSE supported by the University of Montpellier.


The postdoc position will be with the Inria NEO team.





Topic description:

Recently, we have witnessed the tremendous success of Deep Reinforcement Learning (DRL) algorithms, specifically the Deep Q-Network (DQN) algorithm [1], in various application domains. To name just a few examples, DRL has achieved superhuman performance in Go, chess and many Atari video games. We would also like to mention the impressive progress of DRL applications in robotics [2], telecommunications [3] and medicine [4].

However, as was recently pointed out in [5], the original DQN scheme lacks convergence guarantees. In [5], a new Deep Reinforcement Learning scheme, FG-DQN, was proposed, which not only has sound theoretical convergence guarantees but has also shown superior performance on some benchmark environments.

The goal of this postdoc project is to further study and apply the FG-DQN scheme. In particular, the targeted application areas are active learning and recommender systems.


[1] Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.

[2] Gu, S., et al. (2017). Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In 2017 IEEE International Conference on Robotics and Automation (ICRA) (pp. 3389-3396).

[3] Luong, N. C., Hoang, D. T., Gong, S., Niyato, D., Wang, P., Liang, Y. C., & Kim, D. I. (2019). Applications of deep reinforcement learning in communications and networking: A survey. IEEE Communications Surveys & Tutorials, 21(4), 3133-3174.

[4] Jonsson, A. (2019). Deep reinforcement learning in medicine. Kidney Diseases, 5(1), 18-22.

[5] Avrachenkov, K. E., et al. (2021). Full gradient DQN reinforcement learning: A provably convergent scheme. In Modern Trends in Controlled Stochastic Processes (pp. 192-220). Springer.



Technical skills required:

Good knowledge of Markov Decision Processes and/or Reinforcement Learning, and of Python


Good level of English. Conversational French is an advantage but not strictly required.


Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage


Gross Salary: 2653 € per month