PhD Position F/M Modeling and data curation in robot policy learning

Contract type : Fixed-term contract

Level of qualifications required : Graduate degree or equivalent

Function : PhD Position

Level of experience : Recently graduated

Context

The PhD will be carried out at Inria, in the Willow research team.

Assignment

This PhD addresses the challenge of data in robotic policy learning. The focus is on two key research directions: (1) understanding the structure and utility of existing robotic datasets, and (2) scaling robot learning via human demonstration videos.

The first line of research involves a comprehensive analysis of robotics datasets to identify which samples contribute most to policy performance and which are redundant. It will draw on techniques such as influence functions and diversity metrics to support data curation strategies that improve learning efficiency. Beyond per-sample importance, this work will investigate which types of data features (semantic labels, 2D vs. 3D structure, or specific modalities such as touch, language, or speech) most effectively support generalizable policy learning. In turn, this analysis aims to guide the construction of more comprehensive datasets for robot learning, spanning different embodiments and skill domains.

The second line of work aims to unlock large-scale learning from human task demonstrations. Rather than relying on manual data collection or simulation, this research will develop methods to extract structured, task-relevant information (such as end-effector state changes, contact dynamics, or long-term intent) directly from real-world videos of humans performing tasks. Key challenges such as embodiment mismatch, contact ambiguity, and occlusions will be addressed through morphology-aware embeddings and policy distillation strategies that adapt human-derived policies to robotic embodiments via reinforcement learning or embodiment-agnostic models. The research will further investigate how to leverage human error examples to teach robots what not to do, and how to extract rich physical attributes from video using motion cues and learned dynamics priors.

Together, these efforts seek to lay the groundwork for data-efficient, scalable learning pipelines that pave the way for generalist robot policies. By combining insights from dataset analysis with supervisory signals from human behavior, the research aims to bridge the gap between data availability and robotic capability.

Main activities

  • Analyse and implement related work. 
  • Design novel solutions.
  • Write progress reports and papers. 
  • Present work at conferences.

Skills

Technical skills and level required : Programming skills are required.

Languages : English and possibly French.

Relational skills : Good communication skills. 

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage