Contract type: Fixed-term contract (CDD)
Renewable contract: Yes
Required degree level: PhD or equivalent
Function: Post-Doctoral Researcher
Desired level of experience: Recent graduate
Context and assets of the position
The position is based in central Paris, France. As it is part of a joint team project, the candidate will be co-located at the Inria and Valeo.ai sites.
At Inria, the candidate will work in the RITS team, which has approximately 20 people working on computer vision, decision, planning, and modeling for autonomous driving. The computer vision group publishes in top-tier scientific venues in computer vision and machine learning.
Valeo.ai is the artificial intelligence research center for automotive applications of the Valeo Group. The center was started in 2017 to conduct ambitious research projects on assisted and autonomous driving. The Valeo.ai group has about 25 people, all publishing in top-tier scientific venues in artificial intelligence.
The work environments are pleasant and lively, with people from all over the world. Social skills are appreciated, as collaboration with other researchers and PhD students is expected.
Assigned mission
3D Transformers for Autonomous Driving
Main activities
In the same way that learning with transformers brought a revolution in natural language processing [Vas17], and then in image processing [Car20, Dos21], recently emerging transformers for point clouds [Guo21, Zha21] are now also starting to significantly boost the performance of 3D processing. Yet, the general concepts of transformers and the reasons for their success are not fully understood, and there are still many open ways to adapt them efficiently to 3D.
The emergence of transformers opens a wide range of research subjects that matches well with the research axis "Vision and 3D Perception for Scene Understanding". The goal of this postdoc is to address some of them.
- A first objective is to investigate the specificity of transformers for 3D. It includes the question of positional encoding, which differs widely in 3D from the 1D (text) and 2D (image) settings. It also includes the quadratic complexity of cross- and self-attention, taking into account the sparsity of point clouds (a minimal point-cloud attention sketch is given after this list). More generally, it also concerns the time and space complexity of point cloud processing, with a view to real-time, embedded software.
This study of 3D transformers will be carried out through downstream tasks that include object detection, semantic or panoptic segmentation, and denser depth prediction, which are key tasks for autonomous driving. We will study in particular the impact of the specific sparsity and data patterns induced by vehicle sensors. We will also consider a stream of point clouds, as available from a lidar, taking time into account in a 4D perspective.
- Besides, we will investigate the use of transformer backbones for self-supervision [Sim21], which is a key approach to bring supervised learning to a new level by saving the cost of annotating large datasets. We will study new 3D pretext tasks that transformers can specifically leverage, as well as contrastive learning techniques, which are tightly linked to the attention mechanism of transformers (see the contrastive-loss sketch after this list).
A PhD student is about to start working on self-supervision for 3D at Valeo, and another works on 2D supervision for 3D at Inria, though neither of them focuses on transformers. The postdoc will be able to work jointly with either PhD student on self-supervision issues.
- Moreover, we will study the transferability of learned transformer models from the perspective of domain adaptation [Vu19a, Vu19b]. In particular, we will investigate the disentangling of latent-space representations, working towards domain-invariant features by enforcing orthogonality of the domain features while enabling the discovery of exclusive task or domain features, through their realization via multi-head attention.
Another PhD student at Valeo is about to start working on 3D domain adaptation, although from a different perspective (using optimal transport). There will nonetheless be a number of collaboration opportunities between this PhD student and the postdoc regarding the adaptation of transformer-based features.
- A last research direction concerns multi-modality, when lidar point clouds are acquired together with camera images, to leverage the similarity and complementarity of sensor information [Jar20]. One technical subject concerns a possible interplay between the forms of transformer attention used in 2D and the kinds of attention that are and will be developed in 3D (see the cross-attention sketch after this list). Another, more general question is joint self-supervision from the interaction of 2D and 3D, or from cross-task representations. Last, we will study the intertwined relation of geometry and semantics through the semantic scene completion task [Rol20, Rol21].
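To illustrate the first objective, the sketch below shows a minimal point-cloud self-attention layer in PyTorch, with a positional encoding computed from relative xyz offsets and attention restricted to k nearest neighbors to avoid the quadratic cost of full self-attention. This is only an illustrative sketch of the kind of design questions at stake, not an implementation prescribed by the project; all names and hyperparameters are assumptions.

```python
# Hypothetical sketch: local self-attention over a point cloud, in the spirit of
# point-cloud transformers such as [Zha21]. Positional encoding is derived from
# relative xyz offsets; attention is restricted to k nearest neighbors to keep
# the cost roughly linear in the number of points instead of quadratic.
import torch
import torch.nn as nn


class LocalPointAttention(nn.Module):
    def __init__(self, dim: int, k: int = 16):
        super().__init__()
        self.k = k
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        # Positional encoding from relative coordinates (3 -> dim).
        self.pos_mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # xyz: (B, N, 3) point coordinates, feats: (B, N, C) point features.
        B, N, C = feats.shape
        # k-nearest neighbors in Euclidean space (O(N^2) here for simplicity;
        # a real pipeline would use a sparse or voxel-based neighbor search).
        dists = torch.cdist(xyz, xyz)                         # (B, N, N)
        knn_idx = dists.topk(self.k, largest=False).indices   # (B, N, k)

        q = self.to_q(feats)                                  # (B, N, C)
        k = self.to_k(feats)
        v = self.to_v(feats)

        batch_idx = torch.arange(B, device=feats.device).view(B, 1, 1)
        k_nb = k[batch_idx, knn_idx]                          # (B, N, k, C)
        v_nb = v[batch_idx, knn_idx]
        rel_pos = xyz[batch_idx, knn_idx] - xyz.unsqueeze(2)  # (B, N, k, 3)
        pos_enc = self.pos_mlp(rel_pos)                       # (B, N, k, C)

        # Scaled dot-product attention within each local neighborhood.
        attn = (q.unsqueeze(2) * (k_nb + pos_enc)).sum(-1) / C ** 0.5  # (B, N, k)
        attn = attn.softmax(dim=-1)
        return (attn.unsqueeze(-1) * (v_nb + pos_enc)).sum(dim=2)      # (B, N, C)
```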
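For the self-supervision direction, the following sketch shows a standard InfoNCE-style contrastive loss between point features extracted from two augmented views of the same scene; the backbone and the point correspondences are assumed to be given. Again, this is an assumed, generic setup for illustration, not the method to be developed in the postdoc.

```python
# Hypothetical sketch: InfoNCE contrastive loss between point features extracted
# from two augmented views of the same point cloud. Matched points (the same
# physical point seen in both views) form positive pairs; all others are negatives.
import torch
import torch.nn.functional as F


def point_info_nce(feats_a: torch.Tensor, feats_b: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    # feats_a, feats_b: (M, C) features of M matched points in view A and view B.
    feats_a = F.normalize(feats_a, dim=-1)
    feats_b = F.normalize(feats_b, dim=-1)
    logits = feats_a @ feats_b.t() / temperature          # (M, M) similarity matrix
    targets = torch.arange(feats_a.shape[0], device=feats_a.device)
    # Each point in view A should match its counterpart in view B, and vice versa.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```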
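For the multi-modal direction, the sketch below illustrates one possible form of 2D/3D interplay: a cross-attention block in which lidar point tokens query image tokens (e.g. patch features from a 2D backbone). This is a generic sketch under assumed shapes, not the fusion scheme of [Jar20].

```python
# Hypothetical sketch: cross-attention where 3D point tokens attend to 2D image
# tokens, so that each point can aggregate appearance information from the camera.
import torch
import torch.nn as nn


class PointToImageCrossAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, point_tokens: torch.Tensor, image_tokens: torch.Tensor) -> torch.Tensor:
        # point_tokens: (B, N, C) lidar features; image_tokens: (B, P, C) patch features.
        attended, _ = self.attn(query=point_tokens, key=image_tokens, value=image_tokens)
        # Residual connection keeps the original 3D features.
        return self.norm(point_tokens + attended)
```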
[Car20] End-to-End Object Detection with Transformers. Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko. European Conference on Computer Vision (ECCV), 2020.
[Dos21] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. International Conference on Learning Representations (ICLR), 2021.
[Guo21] PCT: Point cloud transformer. Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu. Computational Visual Media, 2021.
[Jar20] xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation. Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Émilie Wirbel, Patrick Pérez. Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[Rol20] LMSCNet: Lightweight Multiscale 3D Semantic Completion. Luis Roldao, Raoul de Charette, Anne Verroust-Blondet. International Conference on 3D Vision (3DV), 2020.
[Rol21] 3D Semantic Scene Completion: a Survey. Luis Roldao, Raoul de Charette, Anne Verroust-Blondet. International Journal of Computer Vision (IJCV), 2021.
[Sim21] Localizing Objects with Self-Supervised Transformers and no Labels. Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet, Jean Ponce. British Machine Vision Conference (BMVC), 2021.
[Vas17] Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Conference on Neural Information Processing Systems (NeurIPS), 2017.
[Vu19a] ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation. Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez. Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[Vu19b] DADA: Depth-aware Domain Adaptation in Semantic Segmentation. Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez. International Conference on Computer Vision (ICCV), 2019.
[Zha21] Point Transformer. Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun. International Conference on Computer Vision (ICCV), 2021.
Skills
Applicants should have defended or be about to defend their PhD and have a strong publication record. They should have a solid background in computer vision (including 3D processing) and machine learning, particularly in deep learning, with strong PyTorch coding skills.
Applicants should apply on this platform, providing:
• a cover letter explaining their interest in and suitability for the postdoc topic,
• their CV/resume,
• optionally, references or recommendation letters.
Application deadline: June 23rd, 2022. Starting date: October 1st, 2022 or earlier (firm).
Benefits
- Subsidised catering service
- Partially-reimbursed public transport
- Flexible working hours
- Sports facilities
General information
- Theme/Domain: Robotics and intelligent environments; Scientific computing (BAP E)
- City: Paris
- Inria center: CRI de Paris
- Desired start date: 2022-09-01
- Contract duration: 2 years
- Application deadline: 2022-06-23
Contacts
- Inria team: RITS
- Recruiter: Raoul de Charette / raoul.de-charette@inria.fr
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, involve more than 3,500 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents across more than 40 different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 180 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.
Instructions for applying
Defence and security:
This position may be assigned to a restricted-access zone (zone à régime restrictif, ZRR), as defined in decree no. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorization to access such a zone is granted by the head of the institution, following a favorable ministerial opinion, as defined in the order of 3 July 2012 on the PPST. An unfavorable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Please note: applications must be submitted online via the Inria website. Processing of applications sent through other channels is not guaranteed.