PhD Position F/M [PhD campaign] on distributed automated machine learning with application to IoT data

The description of the offer below is in English

Contract type: Fixed-term contract

Required degree level: Master's degree (five years of higher education) or equivalent

Role: PhD student

Context and strengths of the position

The MIMOVE team at Inria Paris undertakes research enabling next-generation mobile distributed systems, from their conception and design to their runtime support, focusing on middleware and data. MIMOVE has longstanding expertise in mobile and service-oriented computing, semantic technologies, interoperability, system emergence and evolution, and edge/fog computing. MIMOVE works on these topics through many national and international collaborations with academia and industry, including large-scale software development of real-world systems. MIMOVE’s research results impact various application domains; MIMOVE focuses in particular on the application areas of IoT and smart cities.

The PhD student will be an employee of Inria and will be supervised by Nikolaos Georgantas (nikolaos.georgantas@inria.fr) and Maroua Bahri (maroua.bahri@inria.fr).

Assignment

Automated Machine Learning (AutoML) is an approach that optimizes the machine learning process, making it more accessible and efficient for users with varying levels of expertise [1]. AutoML leverages algorithms and computational capabilities to automate key aspects of the machine learning pipeline, such as feature engineering, model selection, and hyperparameter tuning. This enables individuals with limited machine learning expertise to build and deploy impactful models. However, AutoML is still in its infancy and struggles to keep pace with the growing volumes of heterogeneous IoT data that arrive continuously. Moreover, existing automated solutions are primarily designed for batch settings and are centralized, making them unsuitable for handling continuous IoT data streams at scale.

The objective of this PhD thesis is to investigate AutoML and enhance its adaptability and efficiency in distributed environments, particularly with IoT data. This will involve researching and integrating algorithms into the AutoML pipeline, with a focus on optimizing hyperparameters for real-time IoT data [2][3] in distributed settings. Additionally, two distributed aspects will be examined to improve resource utilization: (i) distributing AutoML processing tasks to manage the computational complexity of AutoML on IoT data streams [4], and (ii) processing heterogeneous distributed IoT data in edge computing environments with nodes of varying capabilities, where data are processed near their sources to reduce communication overhead, network delay, and bandwidth usage, and even to enforce data privacy [5].

References:

[1] Hutter, F., Kotthoff, L., & Vanschoren, J. Automated Machine Learning: Methods, Systems, Challenges. Springer Nature, 2019.

[2] Kulbach, C., Montiel, J., Bahri, M., Heyden, M., & Bifet, A. "Evolution-Based Online Automated Machine Learning." Pacific-Asia Conference on Knowledge Discovery and Data Mining. Cham: Springer International Publishing, 2022.

[3] Carnein, M., Trautmann, H., Bifet, A., & Pfahringer, B. "confStream: Automated Algorithm Selection and Configuration of Stream Clustering Algorithms." Learning and Intelligent Optimization: 14th International Conference, LION, 2020.

[4] Abd Elrahman, A., El Helw, M., Elshawi, R., & Sakr, S. "D-SmartML: A Distributed Automated Machine Learning Framework." IEEE 40th International Conference on Distributed Computing Systems (ICDCS), 2020.

[5] Preuveneers, D. "AutoFL: Towards AutoML in a Federated Learning Context." Applied Sciences, 2023.

Main activities

The PhD student will conduct original research on the topic described above. The expected activities include, but are not limited to:

  • Literature review on AutoML, distributed computing, and edge analytics
  • Formulation of AutoML for data streams in a distributed context
  • Development of distributed AutoML systems
  • Evaluation of the newly proposed approach(es)
  • Scientific publications and presentation of results at conferences

Skills

  • Sound knowledge of machine learning/distributed systems/optimization concepts
  • Software development skills: Python and Java
  • Interpersonal skills: team player (verbal communication, active listening, motivation and commitment)
  • Good level of spoken and written English

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage

Remuneration

Monthly gross salary: 2100 € during the first and second years, 2190 € in the final year.