2022-05326 - Post-Doctoral Research Visit F/M Software Heritage: large-scale empirical analysis of open source code artifacts

Contract type : Fixed-term contract

Renewable contract : Yes

Level of qualifications required : PhD or equivalent

Function : Post-Doctoral Research Visit


This postdoc position is open at Inria in the context of the Software Heritage project, on the topic of large-scale empirical analysis of open source code artifacts, such as source code files and commits, as captured by state-of-the-art distributed version control systems (VCSs).

Inria is a national research institute dedicated to digital sciences that promotes scientific excellence and transfer. Inria employs 2,400 people organised in research project teams, usually in collaboration with its academic partners. This agile organisation allows its scientists, drawn from the best universities in the world, to meet the challenges of computer science and mathematics, either through multidisciplinary work or with industrial partners.

Software Heritage is a unique initiative to build the universal archive of software source code, catering for the needs of research, industry and society as a whole.


Assignments :
With the help of the Software Heritage team, the recruited person will work on the analysis of the Software Heritage graph dataset (see the article online at https://dl.acm.org/citation.cfm?id=3341907), the largest publicly available corpus of Free and Open Source Software (FOSS) development history.


Main activities

To this end, the recruited person will:

  • Exploit the compressed representation of the Software Heritage graph (see the article online at http://dx.doi.org/10.1109/SANER48275.2020.9054827) to conduct empirical analyses on subsets of interest of the Software Heritage archive.
  • Propose, implement, and experimentally validate novel ways of exploiting the Software Heritage archive to conduct similar analyses in the future.
  • Produce and curate open datasets mixing and matching software artifacts from Software Heritage and related data sources.

The research activity will take place in a multidisciplinary team involving computer scientists and industry leaders with the purpose of analyzing free/open source software at the scale of Software Heritage. It will also involve documentation of the work and results in the form of conference proceedings and journal papers, and the presentation of the results at scientific meetings.


Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage