Computer Vision: PhD thesis on Simulatable Physics-aware World Models

The description of the position below is in English

Contract type: Fixed-term contract (CDD)

Required degree level: Master's degree (Bac+5) or equivalent

Position: Contract research engineer

Desired experience level: Recent graduate

Context and advantages of the position

A PhD position is open in Astra-vision, the computer vision group of ASTRA, a newly created joint Valeo+Inria research team on autonomous and safe driving. The research will take place in the context of a collaboration between valeo.ai, an international team conducting AI research for Valeo automotive applications, and the Astra-vision group, which focuses on Vision and 3D Perception for Scene Understanding. The candidate will be based at the Inria office in central Paris.

Assigned mission

The aim of the PhD thesis is to develop physics-aware world models. To this end, rather than optimizing only a latent state of the world, we propose to develop models that jointly optimize a simulatable synthetic representation of the world.
Rather than seeking exhaustive digital twins, we will build on recent findings showing that coarse synthetic scenes (e.g., made of primitives) can serve as a powerful basis for learning the dynamics of the world, and that this learning even transfers to complex tasks such as video generation [RBL+25] or planning [FGZ+26]. Similarly, we expect that simple synthetic dynamics (e.g., collision, gravity) can be used to steer pretrained models towards physical awareness. Preliminary literature [LGL+25] also suggests that using synthetic data not only enables a more nuanced representation of the physical world, but also allows for explicit modeling of physically relevant properties within images and videos. This is crucial because access to such explicit physical representations of a scene enables direct, physics-based interventions on the world representation.

Details of the PhD thesis can be found here: https://astra-vision.github.io/jobs/

Main activities

The precise outline of the work will be refined with the candidate.

Skills

Required skills:

  • Strong knowledge of Computer Vision and Deep Learning
  • Knowledge of the main deep vision architectures
  • Ability to read and analyse a scientific article
  • Experience with some of the main deep learning frameworks
  • Strong coding skills
  • Prior scientific experience is a plus
  • The candidate must be fluent in English

Please also ensure that current diplomatic policies allow your entry into French territory, given the current special health conditions.

Benefits

  • Subsidised catering service
  • Partially-reimbursed public transport
  • Flexible working hours
  • Sports facilities