Big Data Development and Architecture Engineer

Contract type : Fixed-term contract

Renewable contract : Yes

Level of qualifications required : PhD or equivalent

Function : Temporary scientific engineer

Host corps : Research Engineer (Ingénieur de Recherche, IR)

Level of experience : From 5 to 12 years

Context

Software Heritage is a universal software source code archive project, whose aim is to recover, preserve for the very long term and share all publicly available source code, together with its development history (e.g., as stored in version control systems). The Software Heritage archive already contains over 19 billion unique source files and 4.2 billion commits, retrieved from over 300 million software development projects. The Software Heritage initiative, hosted by the Inria Foundation, is an entirely free software (FOSS) and non-profit project.

Assignment

We are looking for an experienced Big Data-oriented software engineer. The ideal candidate will have significant interest and experience in large-scale data processing and exploitation architectures, including storage, indexing and retrieval.

A more detailed list of our current projects is available in the Software Heritage Roadmap 2024 (https://docs.softwareheritage.org/devel/roadmap/roadmap-2024.html).

Main activities

– Setting up a data processing architecture (in the style of Apache Spark)
– Design and modeling of Big Data architectures
– Implementation of solutions based on defined architectures
– Setting up Big Data pipelines

Skills

The ideal candidate will have experience in Big Data development and architecture, preferably in an open-source context. We expect self-organization and autonomy skills commensurate with the candidate’s experience. Participation in existing FOSS projects in any capacity (developer, community organizer, technical writer, etc.) is an added advantage.

The following skills are expected:

– Mastery of a large-scale data processing system (e.g. Apache Spark, Flink, or Hadoop)
– Solid software development skills (at least basic knowledge of Rust and Python)
– Good level of English (written and spoken)
– Use of Git
– Use of continuous integration tools (e.g. GitLab and/or Jenkins)

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage

Remuneration

Remuneration is based on diploma and professional experience.