2020-02493 - 3D Human Pose Estimation from a Single Image with Deep Learning
The job description below is in English

Contract type: Fixed-term contract (CDD)

Required degree level: Master's degree (Bac + 5) or equivalent

Position: Contract research engineer

About the centre or functional department

The Inria Rennes - Bretagne Atlantique Centre is one of Inria's eight centres and has more than thirty research teams. The Inria Centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, a technological research institute, etc.


Context and advantages of the position

The engineer will work closely with Dr. Adnane Boukhayma and Prof. Franck Multon. The work will be conducted at Inria Rennes in the MimeTIC research team. This position is part of the KIMEA Cloud project, a collaboration between Inria Rennes and the start-ups Moovency and Quortex. The goal of this project is to assess the risk of musculoskeletal disorders from a smartphone. The manufacturing industry is the sector most affected by musculoskeletal disorders, in particular due to repetitive gestures and frequent load carrying. Companies in this sector do not necessarily have internal ergonomics resources and cannot always invest in technological tools. Given simply a video of the worker at their workstation, a Deep Learning based algorithm will estimate the 3D positions of the person’s joints. The musculoskeletal risks will subsequently be analyzed automatically from these 3D postures. The role of Inria in this project is to research and develop a robust solution for 3D human pose estimation from color images in the wild, particularly in the industrial context.

Assignment

3D human pose estimation is one of the fundamental problems and most active research areas in computer vision, with applications in many fields such as action recognition, human-machine interfaces, special effects and telepresence. Despite recent advances in the scientific community, monocular 3D human pose estimation in natural images remains far from solved.

The recent surge of Deep Learning has substantially improved the performance of state-of-the-art methods for 2D and 3D human pose estimation. In particular, a family of 3D pose estimators casts the problem as lifting 2D predictions to 3D (e.g. [1,2,3,4]). They generally outperform their end-to-end counterparts, partly because they benefit from the remarkable performance of current 2D pose estimators and partly because massive training image datasets with ground-truth 3D pose annotations are lacking. We propose to follow this direction at first: reproduce state-of-the-art results, then explore further improvements and new approaches, in particular to generalize better to natural images and challenging capture conditions, reduce the dependence on 2D predictions, and use incremental learning to update the learned models with new training examples on the fly.

Within this role, the engineer will lead the development of a deep learning based method for 3D human pose estimation from a single color image. He/she may also participate in the research part of the project. The results of this work are expected to be published in top-tier computer vision conferences such as CVPR, ICCV and ECCV.

We propose the following course of action:

  • 2D to 3D pose estimation lifting:
    Developing a Deep Learning method that predicts 3D poses from 2D poses. This task notably involves generating a simulated 2D/3D training set from 3D motion capture. The challenges are handling erroneous 2D skeletons under large occlusions and covering the multitude of possible 3D viewpoints.
  • Combining end-to-end 3D pose estimation and 2D-3D lifting:
    Developing a Deep Learning method for 3D human pose estimation that can learn simultaneously from image/3D, image/2D and 2D/3D annotation pairs. Test cases include industrial postures and environments, as well as severe capture conditions.
  • Incremental learning:
    Developing a method that allows the learned models to adapt incrementally to new training data without forgetting their existing knowledge. The objective is to avoid retraining the Deep Learning network from scratch for each new example that we would like to add.
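
As an illustration of the first item, simulated 2D/3D training pairs can be produced by viewing a motion-capture skeleton from random viewpoints and projecting it onto the image plane. The sketch below is only a minimal example under assumptions that are not part of the project description: a pinhole camera, rotation about the vertical axis only, Gaussian pixel noise standing in for 2D detector errors, and a toy three-joint skeleton. All function names and parameter values are hypothetical.

```python
import math
import random

def rotate_y(joint, angle):
    """Rotate a 3D point (x, y, z) about the vertical y axis by `angle` radians."""
    x, y, z = joint
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(joint, focal=1000.0, depth=4.0):
    """Pinhole projection of a point placed `depth` metres in front of the camera.

    `focal` (pixels) and `depth` (metres) are assumed values for illustration.
    """
    x, y, z = joint
    zc = z + depth
    return (focal * x / zc, focal * y / zc)

def make_pair(skeleton_3d, rng, noise_px=2.0):
    """Return one (noisy 2D input, 3D target) pair seen from a random viewpoint."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    target_3d = [rotate_y(j, angle) for j in skeleton_3d]
    # Gaussian noise is a crude stand-in for 2D detector errors.
    pose_2d = [
        tuple(c + rng.gauss(0.0, noise_px) for c in project(j))
        for j in target_3d
    ]
    return pose_2d, target_3d

# Toy three-joint "skeleton" in metres (hypothetical values, not real mocap data).
skeleton = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0), (0.2, 0.9, 0.1)]
rng = random.Random(0)
pairs = [make_pair(skeleton, rng) for _ in range(100)]
```

A real pipeline would sample full camera poses and intrinsics, and would simulate occluded or missing joints rather than only small pixel noise.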

[1] Multi-person 2D and 3D pose detection in natural images. TPAMI, 2019.
[2] 3D human pose estimation = 2D pose estimation + matching. CVPR, 2017.
[3] A simple yet effective baseline for 3D human pose estimation. ICCV, 2017.
[4] 3D human pose estimation in the wild by adversarial learning. CVPR, 2018.

Main activities

The engineer will be tasked with:

  • Developing a program that performs 3D human pose estimation from single color images in the wild. The solution will be tested on industrial use cases with possible occlusions and extreme capture situations.
  • Depending on the progress of the project, developing an incremental learning solution that allows the aforementioned 3D pose estimation model to learn from new example cases without retraining on all data and without any loss in the model's performance.
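
The posting does not name an incremental learning method. One common family, shown here purely as an assumed illustration, is rehearsal: a small fixed-size memory of past examples is kept and replayed alongside each batch of new examples, so the model is updated without revisiting the full dataset. The sketch below fills the memory by reservoir sampling; the class and method names are hypothetical, and rehearsal reduces forgetting in practice but does not by itself guarantee that performance is preserved.

```python
import random

class RehearsalBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one example; each example seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

    def mixed_batch(self, new_examples, n_old):
        """Return the new examples followed by up to `n_old` replayed old ones."""
        old = self.rng.sample(self.items, min(n_old, len(self.items)))
        return list(new_examples) + old

# Usage: store 1000 past examples in a memory of 50, then build a mixed batch.
buffer = RehearsalBuffer(capacity=50)
for i in range(1000):
    buffer.add(("old", i))
batch = buffer.mixed_batch([("new", 0), ("new", 1)], n_old=6)
```

Each mixed batch would then be passed to an ordinary training step of the pose estimation model.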

In practice, these tasks imply:

  • Participating in research discussions and algorithm design.
  • Reading and implementing research papers.
  • Reproducing state-of-the-art results.
  • Implementing the ideas proposed by the research collaborators.
  • Creating training and testing datasets.
  • Participating in the publication of the research results.


Skills

  • Candidates should preferably have an MSc or PhD in computer science, applied mathematics, computer vision, computer graphics or machine learning.
  • The ability to read, understand and implement research papers and reproduce scientific results.
  • Good coding skills (Python, C, C++).
  • Proficiency in deep learning frameworks such as PyTorch is a plus.


Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs


Remuneration

Monthly gross salary from 2562 euros, depending on degree and experience