Research internship on deep learning methods for mapping species interactions

The job description below is in English.

Contract type: Internship

Required degree level: Master's degree (Bac+5) or equivalent

Position: Research intern

Desired experience level: Recent graduate

Context and advantages of the position

This internship (with the possibility of a PhD thesis afterwards) is part of the PEPR Agroécologie & Numérique project (EcoControl), which is being developed by nine French institutions. EcoControl aims to improve our understanding of arthropod regulatory services and identify agroecological levers to enhance natural pest regulation in agriculture at both local and territorial levels, in continental France, Corsica and Guadeloupe. To achieve this goal, we will combine fieldwork with innovative conceptual and numerical approaches. We will address the following classical yet still unanswered question: How do biotic and abiotic factors, whether phylogenetic, environmental, related to farming practices or to the introduction of alien species, influence the structure and dynamics of interaction networks between plants (cultivated or not), pests (indigenous or introduced), and their natural enemies (predators/parasitoids)?

In this context, we need to jointly predict the distributions of many species that share responses to the environment, since they occupy a common niche, and that depend on one another through biotic interactions. Multi-species species distribution models (SDMs) are well suited to this setting. In this project we will build upon recent work on deep learning-based SDMs (Ryckewaert et al., 2024), as they can automatically learn joint environmental features that are predictive for all the species.

Assigned mission

The specific objective of this internship is to be decided together with the candidate. It should cover some of the elements of interest to the PhD project that can follow the internship, with the final objective of developing and deploying the necessary methods to map, at a scale ranging from region to country, the likelihood of ecological interactions relevant to the spread and regulation of agricultural pests.

This will require:

1) creating a large-scale dataset of environmental variables (e.g. OSO, Corine Land Cover, SAFRAN, WorldClim), as well as variables stemming from remote sensing (e.g. Landsat, Sentinel), that can be related to species responses (e.g. Picek et al., 2024);

2) training SDMs for the species of interest based on these variables and species observations and/or surveys (starting from the work of Ryckewaert et al., 2024). To make the most of the opportunistic and standardized data available, we aim to extend Integrated SDM principles (Isaac et al., 2020) to deep learning SDMs;

3) using information about ecological networks (from other partners in the EcoControl consortium) in order to understand which locations are compatible with several ecological interactions of interest, mostly related to plant pests;

4) exploring explainable AI tools in order to help ecological modelers better understand complex model behavior at different scales. Special attention will also be paid to ensuring that the developed methods allow for a probabilistic interpretation of the predictions, to make them more useful for downstream tasks within the consortium.

 

Picek et al. "GeoPlant: Spatial Plant Species Prediction Dataset." NeurIPS Datasets and Benchmarks Track (2024).

Ryckewaert et al. "Applying the maximum entropy principle to multi-species neural networks improves species distribution models." arXiv preprint arXiv:2412.19217 (2024).

Isaac, N. J., Jarzyna, M. A., Keil, P., Dambly, L. I., Boersch-Supan, P. H., Browning, E., ... & O’Hara, R. B. "Data integration for large-scale models of species distributions." Trends in Ecology & Evolution, 35(1), 56-67 (2020).

Main activities

The selected candidate is expected to perform some of the following:

  • Become familiar with the state of the art in species distribution modelling.
  • Construct a dataset with the necessary input variables and reference data, including opportunistic species observations and systematic biodiversity surveys.
  • Develop the necessary multi-modal deep learning methods to take advantage of the different modalities of input and reference data.
  • Write papers describing this work, aiming at top computer science and ecology venues.

Skills

We are seeking a candidate who is strongly motivated to improve our understanding of the interaction between biodiversity and agriculture.

Top candidates would also have a strong command of:

  • Python programming
  • Deep learning frameworks (preferably PyTorch)
  • Use of Linux GPU servers via command line
  • Written scientific English

It would be a plus to have familiarity with:

  • GIS and remote sensing
  • Point process models and/or SDMs

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage