PhD Position F/M Foundation Models and Natural Language Communication for Human-Robot Collaboration

Contract type : Fixed-term contract

Level of qualifications required : Graduate degree or equivalent

Function : PhD Position

Context

The HUCEBOT team is a new team of the Inria Center at the University of Lorraine.

The team is dedicated to advancing algorithms for human-centered robots: robots that do not work autonomously in isolation, but instead react, interact, collaborate, and assist humans. To do so, these robots need to intertwine a multi-contact whole-body controller, a digital simulation of the interacting humans, and machine learning models to predict and respond to human movements and intentions. In a crescendo of complexity, the team will tackle scenarios that involve collaboration with cobots, assistance with exoskeletons, and whole-body teleoperation of humanoid robots. The application domains span from industrial robotics to space teleoperation.

The main robots of the team are the Tiago++ bimanual mobile manipulator, the Unitree G1 humanoid, and the Talos humanoid robot. The team also works with Franka cobots and exoskeletons.

The team currently consists of about 25 members, including permanent researchers, PhD students, and post-doctoral researchers.

Serena Ivaldi, head of HUCEBOT, holds the chair in Robotics and AI of the Cluster IA ENACT project (https://cluster-ia-enact.ai/), which funds this PhD thesis. Within this chair, she aims to advance research on natural language interaction to assist humans in different scenarios of collaboration with robots, where safety is paramount.

Assignment

Most work on LLMs for robotics has focused on generating sequences of actions and plans from high-level goals, offline, targeting only autonomous robots isolated from humans. A critical limitation to deploying LLMs on robots that collaborate with humans is the ability to use them online, in a human-in-the-loop scenario, to generate suitable motions and "safe" robot policies.

Here, we use LLMs to generate a robot's motions online in collaborative scenarios where safety is critical: active exoskeletons and mobile manipulators assisting humans in object manipulation. The human vocally commands the robot interactively, online, to control the generation of its motion at the low level: start, stop, direct, and change its low-level parametrization (e.g., compliant behavior, velocity, maximal assistance torque).

The first objective is to design the robot's controller with the natural language interaction feature in mind: the human's commands, corrections, and Approximate Numerical Expressions must be translated into meaningful quantities, consistent with the physics of the problem. What do "faster", "a bit higher", "a little to the right", and "more assistance" mean?
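
As a toy illustration of this grounding problem, the Python sketch below maps a few such expressions to bounded controller-parameter updates; the parameter names, scaling factors, and safety ranges are hypothetical assumptions, not the project's actual design.

```python
# Minimal sketch: grounding Approximate Numerical Expressions (ANEs)
# into bounded controller-parameter updates. All fields, factors, and
# safety ranges are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ControllerParams:
    velocity: float           # end-effector speed [m/s]
    target_height: float      # task-space target height [m]
    assistance_torque: float  # maximal assistance torque [N.m]

# One possible grounding: relative updates, clipped to safe ranges.
ANE_UPDATES = {
    "faster":          ("velocity", "scale", 1.25),
    "a bit higher":    ("target_height", "offset", 0.05),
    "more assistance": ("assistance_torque", "scale", 1.20),
}

SAFE_RANGES = {
    "velocity": (0.0, 0.5),
    "target_height": (0.0, 1.5),
    "assistance_torque": (0.0, 20.0),
}

def apply_ane(params: ControllerParams, phrase: str) -> ControllerParams:
    """Apply a vague verbal correction as a clipped parameter update."""
    field, mode, value = ANE_UPDATES[phrase]
    current = getattr(params, field)
    updated = current * value if mode == "scale" else current + value
    lo, hi = SAFE_RANGES[field]
    setattr(params, field, min(max(updated, lo), hi))
    return params
```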

The second objective is to design new multimodal models fusing LLMs and visual pipelines to predict the human's intent and minimize the need for corrections. Natural language instructions may be incomplete or unclear, but cameras could provide sufficient contextual information to generate an appropriate motion. For example, "take that" could easily be translated into "grasp the bottle" if the bottle is the only item in front of the robot.
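
A toy Python sketch of this disambiguation idea follows, assuming a hypothetical list of detected objects coming from the visual pipeline; the function and field names are illustrative placeholders, not the project's actual components.

```python
# Toy sketch: resolving an under-specified ("deictic") command using
# the visual pipeline's detections. Names and logic are illustrative.
from typing import Optional

DEICTIC_WORDS = {"that", "this", "it"}

def resolve_command(utterance: str, detected_objects: list[str]) -> Optional[str]:
    """Ground a vague grasp command in the scene context."""
    if set(utterance.lower().split()) & DEICTIC_WORDS:
        if len(detected_objects) == 1:
            # Unambiguous scene: ground the pronoun to the only object.
            return f"grasp the {detected_objects[0]}"
        return None  # ambiguous scene: ask the human instead of guessing
    return utterance

# resolve_command("take that", ["bottle"]) -> "grasp the bottle"
```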

The third objective is to detect emergency commands, leveraging both LLMs and audio processing models for nonverbal communication, and to generate suitable reactive robot behaviors. Humans are often unable to speak clearly when they interact with a robot: sometimes fear takes over and they do not speak at all, mumble, or scream, when they could just say a clear "stop". Detecting emergency commands is critical for deploying robots in the real world.
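
As a minimal sketch of such a fusion, the snippet below combines a hypothetical speech-recognizer transcript with a distress score from an assumed nonverbal audio classifier; the keywords, threshold, and components are illustrative assumptions.

```python
# Minimal sketch: fusing a verbal keyword check with a nonverbal audio
# cue (e.g., a scream/distress score from an assumed audio classifier).
# All names, words, and thresholds are illustrative assumptions.
EMERGENCY_WORDS = {"stop", "halt", "wait"}

def is_emergency(transcript: str, distress_score: float,
                 threshold: float = 0.8) -> bool:
    """Trigger on a clear verbal command OR a strong nonverbal cue."""
    verbal = any(w in transcript.lower().split() for w in EMERGENCY_WORDS)
    nonverbal = distress_score >= threshold  # screaming, panicked mumbling
    return verbal or nonverbal

# A positive detection would preempt the current motion, e.g., by
# switching the controller to a compliant, safe stop.
```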

The PhD student will carry out research on the aforementioned objectives, and will benefit from our collaboration with E. Zibetti (Paris 8, SHS), an expert in Approximate Numerical Expressions in psychology, and D. Sadigh (Stanford University), who leads research on LLMs for robot actions.

Real-world demonstrations, with real robots and real humans interacting with them, are mandatory in this PhD.

Main activities

Implement, test, and develop novel algorithms for robots that use language models and foundation models. Write papers and present them at conferences. Write, test, validate, and document the associated software. Experiments with real robots are mandatory.

The PhD student will also be involved in the activities organized by the Cluster-AI project ENACT, which may involve dissemination actions, meetings, and presentations to relevant stakeholders (Europe, France, industry, etc.).

The PhD student will also participate, with the HUCEBOT team, in robotics competitions and hackathons organized by the European project euROBIN, with demonstrations of the robots' skills at the European Parliament in 2026 and at ICRA 2026.

Skills

Good skills in Python (PyTorch). Ideally, prior experience with LLMs, VLMs, and foundation models.

Good understanding of robotics.

Languages: English (English is the official language of the team and many members do not speak French).

Proactivity, curiosity, and the ability to work in a team are fundamental.

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage

Remuneration

2200€ gross/month