
March 1, 2024

Big Data Development and Architecture Engineer

The Software Heritage project

Software Heritage is a universal software source code archive project whose aim is to recover, preserve for the very long term, and share all publicly available source code, together with its development history (e.g., as stored in version control systems). The Software Heritage archive already contains over 17 billion unique source files and 3.6 billion commits, retrieved from over 266 million software development projects. The Software Heritage initiative, hosted by the Inria Foundation, is a non-profit project built entirely on free and open-source software (FOSS).

The Position

We are looking for an experienced Big Data-oriented software engineer. The ideal candidate will have significant interest and experience in large-scale data processing and exploitation architectures, including storage, indexing and retrieval.

A more detailed list of our current projects is available in the Software Heritage Roadmap 2023 (https://docs.softwareheritage.org/devel/roadmap/roadmap-2023.html).

Main tasks and activities

– Setting up a data processing architecture (à la Spark)
– Designing and modeling Big Data architectures
– Implementing solutions based on the defined architectures
– Setting up Big Data pipelines

Skills

The ideal candidate will have experience in Big Data development and architecture, preferably in an open-source context. We expect self-organization and autonomy skills commensurate with the candidate’s experience. Participation in existing FOSS projects in any capacity (developer, community organizer, technical writer, etc.) is an added advantage.

The following skills are expected:

– Mastery of a large-scale data processing system (e.g. Apache Spark, Flink, or Hadoop)
– Solid software development skills (at least the basics of Rust and Python)
– Good level of English (written and spoken)
– Use of Git
– Use of continuous integration tools (e.g. GitLab and/or Jenkins)

Knowledge and experience of the following will be considered an asset:

– Experience in data processing on a scale of tens of terabytes or even petabytes
– Experience with Cassandra and Kafka
– Knowledge of Java
– Knowledge of Kubernetes
– Data visualization

Software Heritage relies on a complex technical architecture, built on many different technologies, which continues to evolve. We do not expect candidates to master all of them, but rather to be open to discovery and learning. Prior knowledge of one or more of the subjects listed above will make it easier to get to grips with the project, but we encourage you to apply whatever your level of experience in these technologies.

Working conditions

We are a team of 15 people, including 9 technical staff (5 developers and 4 sysadmins).
Autonomy, transparency and consultation are at the heart of our values (the project is free and open source).

Most of the team is based at the Inria center in Paris, but the position is open to any location in France close to an Inria center (Bordeaux, Lille, Lyon, Grenoble, Rennes, Saclay, Sophia Antipolis, Nancy).

The contract offered by Inria is a 2-year renewable full-time fixed-term contract, with the prospect of a permanent position.
– Telecommuting: 90 days/year (average 2 days per week)
– Vacation: 35 days + 10 days RTT
– Salary range: 30 to 70 k€ depending on profile and experience.

Application

Please send your application (CV + cover letter) to hiring@softwareheritage.org
