Level of qualifications required : Graduate degree or equivalent
Function : PhD Position
This PhD fits within the scope of the ANR project "ROBOVOX" involving the Multispeech team at Inria Nancy - Grand Est (https://team.inria.fr/multispeech/), the speech processing team at Laboratoire d'informatique d'Avignon (http://lia.univ-avignon.fr/), and A.I. Mergence (http://www.ai-mergence.com/fr/).
Speaker identification has recently been deployed in several real-world applications, including secured access to bank services via telephone or internet. However, identification based solely on voice remains a modality with limited reliability under real conditions that involve acoustic perturbations (noise, reverberation...). Recent work indicates that multichannel speech enhancement of the test signal improves the performance of speaker identification systems in noisy environments [1], especially as it enables controlling the distortion introduced on the speech signal [2]. Additionally, the use of deep learning [3] for multichannel speech enhancement has recently allowed for a large performance improvement [4, 5].
[1] D. Ribas, E. Vincent, and J. R. Calvo. "Full multicondition training for robust i-vector based speaker recognition". In Proc. Interspeech, 2015.
[2] R. Serizel, M. Moonen, B. Van Dijk, and J. Wouters. "Low-rank approximation based multichannel Wiener filter algorithms for noise reduction with application in cochlear implants". IEEE/ACM Transactions on Audio, Speech and Language Processing, 2014, vol. 22, pp. 785–799.
[3] L. Deng and D. Yu. Deep Learning: Methods and Applications. NOW Publishers, 2014.
[4] J. Heymann, L. Drude, and R. Haeb-Umbach. "Neural network based spectral mask estimation for acoustic beamforming". In Proc. ICASSP, 2016.
[5] A. A. Nugraha, A. Liutkus, and E. Vincent. "Multichannel audio source separation with deep neural networks". IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, vol. 24, n. 9, pp. 1652–1664.
The goal of this PhD thesis is to explore the use of deep learning based speech enhancement techniques to improve the performance of speaker identification systems in real conditions. As a first step, we propose to develop algorithms that process noise and reverberation simultaneously, inspired by recent work on dereverberation [6]. The final goal is to propose end-to-end approaches that perform speaker identification directly from the perturbed multichannel signal. We propose to explore methods that compare several recordings of the same speaker captured under different acoustic conditions in order to learn intermediate representations that are robust to these perturbations [7, 8, 9].
[6] O. Schwartz, S. Gannot, and E. A. Habets. "Multi-microphone speech dereverberation and noise reduction using relative early transfer functions". IEEE/ACM Transactions on Audio, Speech and Language Processing, 2015, vol. 23, n. 2, pp. 240–251.
[7] H. Bredin. "TristouNet: triplet loss for speaker turn embedding". In Proc. ICASSP, 2017.
[8] G. Andrew, R. Arora, J. Bilmes, and K. Livescu. "Deep canonical correlation analysis". In Proc. ICML, 2013.
[9] S. Sun. "A survey of multi-view machine learning". Neural Computing and Applications, 2013, vol. 23, n. 7-8, pp. 2031–2038.
MSc in computer science, machine learning, signal processing
Experience with the Python programming language
Experience with deep learning toolkits is a plus
- Subsidised catering service
- Partially-reimbursed public transport
- Social security
- Paid leave
- Flexible working hours
- Sports facilities
Salary: 1982€ gross/month for the 1st and 2nd years, 2085€ gross/month for the 3rd year.
Monthly salary after taxes : around 1594€ for the 1st and 2nd years, 1677€ for the 3rd year (medical insurance included).
- Town/city : Villers-lès-Nancy
- Inria Center : CRI Nancy - Grand Est
- Starting date : 2019-09-01
- Duration of contract : 3 years
- Deadline to apply : 2019-03-31
- Inria Team : MULTISPEECH
PhD Supervisor :
Serizel Romain / email@example.com
Inria, the French national research institute for the digital sciences, promotes scientific excellence and technology transfer to maximise its impact. It employs 2,400 people. Its 200 agile project teams, generally with academic partners, involve more than 3,000 scientists in meeting the challenges of computer science and mathematics, often at the interface of other disciplines. Inria works with many companies and has assisted in the creation of over 160 startups. It strives to meet the challenges of the digital transformation of science, society and the economy.
Instructions to apply
Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter such an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy :
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.