Home Page - Software Heritage

Software [is our] Heritage

Software is fragile

unlike words carved in stone, it can be deleted or get corrupted

Software is the key

to access information and a fundamental part of human heritage

Software Heritage

preserves software source code for present and future generations

We are building the universal software archive

Browse the archive

We collect and preserve software in source code form, because software embodies our technical and scientific knowledge and humanity cannot afford the risk of losing it.

Software is a precious part of our cultural heritage. We curate and make accessible all the software we collect, because only by sharing it can we guarantee its preservation in the very long term.

Discover our mission

Use the Archive

Browse

You can search for the software origins (repositories, source packages, etc.) we have already archived and see when we visited them, which amounts to a “wayback machine” for source code. Once you have identified an origin of interest, the web app lets you browse through it as you usually do with version control system browsing interfaces.

Browse the archive

Save Code Now

You can trigger instant archiving of any source code repository that is not yet ingested in the Software Heritage archive, or that is not up to date.

This complements the regular crawling of software origins that is performed on the main code hosting platforms, and gives you the possibility of ensuring that the code you are interested in is properly archived.
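The same request can also be sent programmatically. The sketch below only builds the HTTP request; it assumes the archive exposes a save endpoint of the form `/api/1/origin/save/<visit_type>/url/<origin_url>/` under `https://archive.softwareheritage.org`, which should be checked against the API documentation. The helper name `save_request` and the example origin are hypothetical.

```python
import urllib.request

# Assumed base URL of the public API; verify it in the API documentation.
API_ROOT = "https://archive.softwareheritage.org/api/1"


def save_request(origin_url, visit_type="git"):
    """Build (but do not send) a POST request asking for an origin to be archived.

    The endpoint layout is an assumption, not taken from the text above.
    """
    if not origin_url or not visit_type:
        raise ValueError("origin_url and visit_type must be non-empty")
    url = f"{API_ROOT}/origin/save/{visit_type}/url/{origin_url}/"
    return urllib.request.Request(
        url, method="POST", headers={"Accept": "application/json"}
    )


# Hypothetical origin, used only to show the call shape.
request = save_request("https://example.org/project.git")
print(request.get_method(), request.full_url)
```

Sending the request is a separate step, for example with `urllib.request.urlopen(request)`, and the service may apply rate limits or require authentication.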

Save Code Now

Archive Research Software

You can seamlessly archive your research software artifacts and add precise references to specific versions of the source code, down to fragments of individual source files, to your research articles. Just follow the link below for the guidelines.
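To illustrate what such a precise reference can look like, here is a minimal sketch. It assumes references take the form of Software Heritage persistent identifiers (SWHIDs), that the identifier of a file content is the same hash Git computes for a blob, and that a `lines` qualifier selects a fragment; none of this is stated in the text above, so treat it as an assumption to verify against the guidelines. The function names are hypothetical.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Compute the assumed identifier of a file content.

    The hash covers a Git-style header ("blob <length>\\0") followed by the
    raw bytes, so it matches what `git hash-object` would print.
    """
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


def with_lines(swhid: str, first: int, last: int) -> str:
    """Append an assumed `lines` qualifier pointing at a fragment of the file."""
    if first < 1 or last < first:
        raise ValueError("line range must satisfy 1 <= first <= last")
    return f"{swhid};lines={first}-{last}"


source = b"print('hello')\n"
print(with_lines(content_swhid(source), 1, 1))
```

Because the identifier is derived from the content itself, anyone holding the same bytes can recompute it and check that a citation points at exactly the intended file.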

Save research software

Rescue Legacy Code

You can help rescue and curate landmark legacy source code while it is still possible to get hold of it and to talk to the people who created it. For this, you can follow the SWHAP process, developed in collaboration with UNESCO and the University of Pisa.

Rescue legacy code

Do we already have your code?

We harvest publicly available source code from many software projects and keep up with development happening there. As of today our archive already contains and keeps safe for you:

API

Programmatic access to the content of the archive is available via the Software Heritage API.

The API lets you navigate the archive as a graph of development-related objects, such as file contents, directories, commits, and releases. With the API, developers can look up individual objects by their IDs, retrieve their metadata, and jump from one to another by following links, e.g., from commits to the corresponding directories or parent commits, or from releases to released commits. The API also lets you retrieve crawling information, such as the tracked software origins and the full list of visits performed on each of them. This makes it possible, for instance, to know when snapshots of a specific Git repository were taken and, for each of them, where each branch was pointing at the time.
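The navigation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the official client: the base URL, the endpoint paths (`/origin/<origin_url>/visits/`, `/<kind>/<id>/`) and the `date` and `snapshot` field names are assumptions to check against the API documentation, and the helper names and sample data are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the public API; verify it in the API documentation.
API_ROOT = "https://archive.softwareheritage.org/api/1"


def visits_url(origin_url):
    """Build the assumed endpoint listing all visits of one origin."""
    return f"{API_ROOT}/origin/{origin_url}/visits/"


def object_url(kind, object_id):
    """Build the assumed lookup endpoint for one archived object."""
    allowed = {"directory", "revision", "release", "snapshot"}
    if kind not in allowed:
        raise ValueError(f"unknown object kind: {kind}")
    return f"{API_ROOT}/{kind}/{object_id}/"


def fetch_json(url):
    """Perform a GET request and decode the JSON response (needs network)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def snapshot_dates(visits):
    """Return (date, snapshot id) pairs for the visits that produced a snapshot."""
    return [(v["date"], v["snapshot"]) for v in visits if v.get("snapshot")]


# Hypothetical response data, shaped after the assumed field names.
sample_visits = [
    {"date": "2020-01-01T00:00:00+00:00", "snapshot": "a" * 40},
    {"date": "2020-02-01T00:00:00+00:00", "snapshot": None},
]
for date, snapshot in snapshot_dates(sample_visits):
    print(date, object_url("snapshot", snapshot))
```

With network access, `snapshot_dates(fetch_json(visits_url(origin)))` would answer the question of when snapshots of a given repository were taken, and fetching each snapshot URL would then show where its branches pointed.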

Read the API documentation

It concerns all of us. And you can help!

Software is so pervasive in our lives that its preservation concerns all of us. Our mission and the archive we are building will serve the needs of the many, from cultural institutions to scientists and industries.

Everyone can help us achieve these ambitious goals.