Source code is spread across a wide range of code hosting platforms and websites, and is developed using a variety of version control systems. In order to collect and archive it properly, Software Heritage needs adaptors for each of these platforms and version control systems. This is a humbling task that needs expert contributions.
This is why we are partnering with funders around the world to provide grants for experts who are willing to engage with the long-term mission of Software Heritage. This page lists these grant programs, together with the grants already awarded, and information on how to apply for the programs that are still open.
A grant from the Alfred P. Sloan Foundation has been awarded to Software Heritage specifically to foster the emergence of a community of expert contributors who can increase the coverage of the Software Heritage archive.
Founded in 1934 by industrialist Alfred P. Sloan Jr., the Foundation is a not-for-profit grantmaking institution that supports high quality, impartial scientific research; fosters a robust, diverse scientific workforce; strengthens public understanding and engagement with science; and promotes the health of the institutions of scientific endeavor.
With this first subgrant from the Alfred P. Sloan Foundation, Cottage Labs will bring its expertise to bear on developing a SWORD-based connector that will allow all instances of InvenioRDM to safely and efficiently archive in Software Heritage the source code of all research projects deposited in them, and to provide the corresponding intrinsic identifiers (SWHIDs) to the research community.
A second subgrant from the Sloan Foundation has been awarded to Stefan Sperling to help expand the Software Heritage archive.
Stefan will be working on CVS and Subversion loaders.
CVS and Subversion were in widespread use in the 1990s and 2000s, before the advent of distributed version control systems like Git or Mercurial.
A fourth subgrant has been awarded to OCamlPro, which will help increase the coverage of the Software Heritage archive by integrating it with the OCaml ecosystem.
OCamlPro will archive source code from Opam packages and add support for falling back to the Software Heritage archive, using SWHIDs, when the source code can no longer be found at its original location.
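As a rough illustration of why an intrinsic identifier makes such a fallback possible: the SWHID of a single file (a "content" object) is derived from the file's bytes alone, using the same hash construction as a Git blob, so it stays valid wherever the file is stored. The sketch below is a minimal example under that assumption, not OCamlPro's implementation; the function name `content_swhid` is hypothetical.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID of a file content, computed from its bytes only."""
    # Same construction as a Git blob hash:
    # SHA-1 over the header "blob <length>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    # "swh:1:cnt:" = scheme, schema version 1, object type "content".
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    print(content_swhid(b""))
```

Because the identifier depends only on the content, a package manager that recorded it at publication time can later ask the archive for exactly that object, whatever happened to the original download URL.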
Castalia Solutions will develop the Maven Repositories connector to archive the Maven ecosystem. This will enable the identification and retrieval of source code from Maven repositories, the parsing of POMs to follow the Maven graph, and the linking of artifacts to the place where they are developed and to their original source code repository.
Using the Maven connector, the Software Heritage archive will be able to access millions of artifacts from the major repositories: Central, Spring, Atlassian, and JBoss.
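To give a sense of what linking an artifact to its source repository can involve, here is a minimal sketch that reads the `<scm><url>` element of a POM. It is not the Castalia Solutions connector; the helper `scm_url` and the example POM (group `org.example`, an `example.org` URL) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# POM files declare this XML namespace on the <project> root element.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical POM, reduced to the fields used here.
EXAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <groupId>org.example</groupId>
  <artifactId>demo</artifactId>
  <version>1.0.0</version>
  <scm>
    <url>https://example.org/scm/demo</url>
  </scm>
</project>"""


def scm_url(pom_xml: str):
    """Return the source repository URL declared in a POM, or None."""
    root = ET.fromstring(pom_xml)
    node = root.find("m:scm/m:url", POM_NS)
    if node is None or not node.text:
        return None  # many POMs do not declare an <scm> section
    return node.text.strip()


if __name__ == "__main__":
    print(scm_url(EXAMPLE_POM))
```

A real connector would also have to deal with parent POMs, property interpolation, and missing or stale `<scm>` sections, none of which this sketch attempts.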
A sixth subgrant has been awarded to Timothy Sample.
Disarchive can disassemble software archives into data and metadata. The goal is to create a small amount of metadata that can be used to recreate a software archive bit-for-bit from the original files. For example, for a software archive made using tar and Gzip, the metadata needs to describe the order of files in the tarball and the compression parameters used by Gzip.
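The kind of information involved can be sketched with the Python standard library. This is only an illustration of the idea, not Disarchive's actual metadata format: the function names and the dictionary layout are hypothetical, and the Gzip header exposes only hints about the compression settings, not the full set of parameters a bit-for-bit rebuild would need.

```python
import gzip
import io
import struct
import tarfile


def make_example() -> bytes:
    """Build a small .tar.gz in memory whose members are not in sorted order."""
    raw = io.BytesIO()
    with gzip.GzipFile(fileobj=raw, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w") as tar:
            for name in ("b.txt", "a.txt"):
                data = name.encode()
                info = tarfile.TarInfo(name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    return raw.getvalue()


def describe_targz(blob: bytes) -> dict:
    """Collect the member order and the fixed Gzip header fields."""
    # Fixed 10-byte Gzip header: magic, method, flags, mtime, extra flags, OS.
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<HBBIBB", blob[:10])
    if magic != 0x8B1F:
        raise ValueError("not a gzip stream")
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        members = [member.name for member in tar.getmembers()]
    return {
        "gzip": {"method": method, "flags": flags, "mtime": mtime,
                 "extra_flags": xfl, "os": os_id},
        "members": members,  # order as stored in the tarball
    }


if __name__ == "__main__":
    print(describe_targz(make_example()))
```

Two archives holding the same files can differ in any of these fields, which is why the file contents alone are not enough to reproduce the original archive.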
HashBang is a company based in Lyon, France. Our team comprises 17 software engineers, data scientists and web designers in a workplace that gathers different cooperating companies, whose core business is to provide advice in software engineering; our expertise is based on open-source software such as GNU/Linux, Python and Django. The way it works is inspired by holacracy.
Thanks to the Sloan Foundation, HashBang will help to expand the coverage of the Software Heritage archive by enabling the archival of three new software sources: the Crates.io Rust package repository, the official Arch Linux packages, and the Arch Linux community packages (AUR).
To this end, HashBang will create the appropriate “listers” and “loaders” that will be integrated into the Software Heritage infrastructure.
A grant from the NLnet Foundation has funded work that allowed Software Heritage to save 250,000 endangered Bitbucket repositories, improve its Mercurial loader, and gain connectors for Nix and Guix.
Since 1997, following its historical contribution to the early internet in Europe, the NLnet Foundation has been financially supporting organizations and people who contribute to an open information society. It funds those with ideas to fix the internet through a fast, competitive and open process.
Do not hesitate to submit a proposal related to Software Heritage directly to NLnet.
Tweag is a software innovation lab that helps deep tech startups quickly scale their engineering performance and execute on high-risk, high-reward projects with confidence. They develop the components needed to ensure that the source code used to build packages using the Nix and Guix functional package managers is systematically archived in the Software Heritage archive.
Thanks again to the NLnet Foundation for providing this grant dedicated to Tweag.
Easter-eggs is a company well known for its commitment to work exclusively with Free Software and for its extensive expertise in the Ceph object storage technology. Thanks to a grant from the NLnet Foundation, they will implement the next-generation object storage for Software Heritage, making it possible to store billions of source code files efficiently.
Obsidian Systems is a software consultancy specializing in Haskell, Nix, curiosity, and innovation. It was founded in 2014 by Ali Abrar and Ryan Trinkle, after they graduated from Harvard Law and spent several years working with tech startups in Boston and New York City.
The incredible amount of waste they witnessed during those years convinced them that there had to be a better way to build products. This conviction drives everything we do at Obsidian. It informs our technology choices and our investments in open source. Most importantly, it affects how we build products.