NLP Engineer: Development of an embodied conversational agent

Contract type: Fixed-term contract

Level of qualifications required: Graduate degree or equivalent

Function: Temporary scientific engineer

Context

This position is part of the "Son-of-Sara" project, a continuation of the "Sara" project from Articulab, a member of the ALMAnaCH project-group at Inria Paris (for more details, see <https://articulab.hcii.cs.cmu.edu/projects/sara/>). The project aims to develop a new kind of LLM-based embodied conversational agent (embodied chatbot). It comprises Natural Language Processing (NLP) modules that understand and generate language, and a Unity virtual agent module that adds nonverbal behaviors of the face and body to the language, resulting in an embodied chatbot capable of interacting naturally with a human user. In the context of our project, this means using machine learning models to process, analyze and generate multimodal information (text, audio and visual body behaviors) in real time, which is then realized through the animation of the agent's body. The system will be equipped with a microphone and a camera to perceive the user (voice, gestures, facial expressions, etc.). It will process and analyze these data to extract precise information, and then generate a vocally, verbally and visually adapted response via its agent (voice, gestures, facial expressions, etc.).

Assignment

Within this project, the engineer will focus on developing the agent's turn-taking capabilities. A fundamental component of a dialogue system is the ability to speak, and to let the interlocutor speak, at the right moments. This ability is called turn-taking. The realism of a dialogue depends on the fluidity of turn transitions between interlocutors, and therefore on the system's turn-taking performance. The engineer's mission is to integrate a turn-taking module into the current system, based on a predictive deep learning model which, from textual or audio data, predicts when the user will end their turn and stop speaking (and therefore when the agent can start speaking).

Main activities

  • Bibliographic research on the state of the art.
  • Development of the module.
  • Integration of the module in the current system (with the help of Marius).

Skills

  • Python, deep learning and NLP libraries (Hugging Face, Transformers, scikit-learn, etc.).
  • Experience with training and evaluating deep learning and NLP models.
  • Experience with the dialogue domain (oral interactions, audio data).
  • Language: Fluent English speakers with at least a B1 level in French, and fluent French speakers with at least a B1 level in English, are both invited to apply.

(This list of skills is provided as a guide only. We encourage you to apply even if you do not have all of them.)

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off under RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage