2022-04901 - Real time speaker verification in real conditions for a mobile robot
The job description below is in English

Required degree level: Master's degree (Bac+5) or equivalent

Position: Temporary scientific engineer

Context and advantages of the position

This engineer position is part of the ANR project “ROBOVOX”, which involves the Multispeech team from Inria Nancy - Grand Est (https://team.inria.fr/multispeech/), the speech processing team from Laboratoire d'informatique d'Avignon (http://lia.univ-avignon.fr/), and A.I. Mergence (http://www.aimergence.com/fr/).

Assignment

Speaker identification has recently been deployed in several real-world applications, including secure access to banking services by telephone or over the internet. However, identification based solely on voice remains of limited reliability under real conditions, which involve several acoustic perturbations (noise, reverberation...) that become significant when speech is recorded at a distance. Recent work indicates that multichannel speech enhancement of the test signal improves the performance of speaker identification systems in noisy environments, especially because it makes it possible to control the distortion introduced into the speech signal. Additionally, the use of deep learning for multichannel speech enhancement has recently yielded large performance improvements.


In ROBOVOX, we are investigating several approaches to robust speaker verification. One of them applies multichannel speech enhancement before the core speaker verification step. In addition, to evaluate performance on speech recorded in real conditions, a dedicated speech corpus is being recorded using a mobile robot.

Main activities

The main activities concern optimizing the implementation and adapting it to real-time operation in real conditions.



The main task will be to optimize the code and the models so that they run in real time on the resources available on the robot. In addition, because the approaches currently under study were developed for batch evaluation, some processing will need to be reorganized for real-time operation so that the result is delivered with limited latency, whatever the duration of the incoming speech signal.
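
As an illustration of what moving from batch to real-time processing can involve, the sketch below replaces a computation that needs the whole utterance with an incremental one whose cost per chunk does not depend on how much speech has already been received. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the "embedding" here is simply the mean of per-frame feature vectors, and the names `StreamingMeanEmbedding`, `batch_mean`, and `cosine_score` are hypothetical.

```python
import math
from typing import Iterable, List, Sequence


class StreamingMeanEmbedding:
    """Running mean of per-frame feature vectors, updated chunk by chunk.

    Assumption: a plain mean stands in for a real speaker embedding model.
    """

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.total = [0.0] * dim  # running sum of all frames seen so far
        self.count = 0            # number of frames seen so far

    def update(self, chunk: Iterable[Sequence[float]]) -> List[float]:
        """Consume one chunk of frames and return the current mean vector."""
        for frame in chunk:
            if len(frame) != self.dim:
                raise ValueError("frame has unexpected dimension")
            for i, value in enumerate(frame):
                self.total[i] += value
            self.count += 1
        if self.count == 0:
            return [0.0] * self.dim
        return [s / self.count for s in self.total]


def batch_mean(frames: Sequence[Sequence[float]]) -> List[float]:
    """Batch reference: needs the whole utterance before giving a result."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


if __name__ == "__main__":
    # Toy data: four 2-dimensional frames arriving in chunks of two.
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    enrolled = [1.0, 0.5]  # hypothetical enrolled speaker vector
    stream = StreamingMeanEmbedding(dim=2)
    for start in range(0, len(frames), 2):
        current = stream.update(frames[start:start + 2])
        print(round(cosine_score(current, enrolled), 3))
```

A score is available after every chunk, so the decision latency is bounded by the chunk length rather than by the utterance duration, and the final streaming result matches the batch one.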

Evaluations will be conducted on the corpus recorded in the ROBOVOX project, which corresponds to the target application conditions.

Skills



  • MSc in computer science, machine learning, or signal processing
  • Experience with the Python programming language
  • Experience with deep learning toolkits is a plus, as well as experience with real-time processing

Benefits


  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage