December 10, 2020

Improving the Subversion loader and adding a CVS loader

As announced before, we received a grant from the Alfred P. Sloan Foundation to support contributions from experts who want to help increase the coverage of the Software Heritage archive. After a first subgrant awarded to Cottage Labs, with a particular focus on Open Access, Open Science and Open Source, we are excited to unveil a second subgrant, awarded to Stefan Sperling, who will work on improving the current Subversion loader and developing a CVS loader.

CVS and Subversion are version control systems that were in widespread use in the 90s and 2000s, before the advent of distributed version control systems like Git or Mercurial.

Their declining popularity means that expertise related to these version control systems will become scarce, and we do not have much time to properly collect and preserve the many CVS and SVN repositories that are still accessible online.

The existing Subversion loader will be reviewed and improved, with the goal of increasing conversion accuracy and performance.

A new CVS loader will be developed, in order to preserve CVS repository histories which were never migrated to other version control systems. Even though most open source projects have already migrated away from CVS, in many cases the migration process is not faithful to the original CVS history and the original repository is the only source of truth.

We also hope that CVS and Subversion repositories which have been hidden from public view, such as internal repositories used by research departments or companies, will become available in the future so that they can be preserved by Software Heritage.

If you too want to get involved and address the long tail of the many other diverse platforms that are used for developing, distributing and/or archiving publicly available source code, just fill in this simple form to start the process!
