2022-05264 - 12-month fixed-term contract (CDD) - Research Engineer - Linguistically inspired language models for closely related languages
The job description below is in English

Contract type: Fixed-term contract (CDD)

Level of qualifications required: Master's degree (Bac + 5) or equivalent

Position: Temporary scientific engineer

Assignment

The research engineer will investigate the conditions under which character- and/or phoneme-level Transformer-based language models trained on multiple closely related languages can perform well on languages or language varieties that are underrepresented in the training corpus. In particular, they will study the impact of including wordform-level information (from wordform or token-like boundaries to morphological features), which will require designing novel ways to incorporate such information into character- and/or phoneme-level language models. Experiments will be carried out, at least initially, on languages of the Hindi belt.

Main activities

Under the supervision of two permanent researchers in the ALMAnaCH project-team, the recruited person will be involved in the following activities:

  • definition of the research questions
  • study of the state of the art in language modelling for low-resource languages
  • study of the state of the art in language modelling for closely related languages
  • design and execution of scientific experiments
  • presentation of the work to team members
  • reporting of the experiments in the form of research publication(s)

Skills

Technical skills and level required: Knowledge of and experience in natural language processing. Experience with language model development and/or NLP for low-resource languages is a plus.

Languages: English (oral and written); French is a plus; a language of the Hindi belt is a plus

Relational skills: Excellent communication skills (oral and written)

Software experience: Python, standard NLP toolkits

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off under RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 12 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities