Software source code is a precious asset and an endangered heritage that we need to take care of. This is why Software Heritage has set itself the goal of collecting all publicly available software in source code form, together with its development history, ensuring its long-term preservation, and making it available to all.
Today, we are delighted to announce that the Software Heritage archive has passed the symbolic mark of one hundred million archived projects.
These projects are stored, with their full development history, in a giant Merkle graph that contains over 7 billion unique source code files and over 1.5 billion commits.
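For readers curious why a Merkle graph lets the archive count "unique" files, here is a minimal sketch of the general idea, not Software Heritage's actual implementation. Every object is identified by a hash of its content, so a file that appears in many projects is stored only once. The names `Archive`, `blob_id`, and `directory_id` are hypothetical; the file hash follows Git's blob convention, while the directory hash is a simplified stand-in for illustration.

```python
import hashlib
from typing import Dict


def blob_id(data: bytes) -> str:
    """Identify a file by its content, using Git's blob hashing convention."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def directory_id(entries: Dict[str, str]) -> str:
    """Identify a directory by hashing its sorted (name, child id) pairs.

    Assumption: this is a simplified format, not Git's real tree encoding.
    """
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(name.encode() + b"\0" + entries[name].encode() + b"\n")
    return h.hexdigest()


class Archive:
    """Toy content-addressed store: one entry per distinct object."""

    def __init__(self) -> None:
        self.objects: Dict[str, object] = {}

    def add_file(self, data: bytes) -> str:
        oid = blob_id(data)
        # An identical file already in the store is not stored again.
        self.objects.setdefault(oid, data)
        return oid

    def add_directory(self, files: Dict[str, bytes]) -> str:
        entries = {name: self.add_file(data) for name, data in files.items()}
        oid = directory_id(entries)
        self.objects.setdefault(oid, entries)
        return oid


# Two hypothetical projects that ship the same LICENSE file.
archive = Archive()
project_a = archive.add_directory({"LICENSE": b"MIT\n", "main.c": b"int main(){}\n"})
project_b = archive.add_directory({"LICENSE": b"MIT\n", "lib.c": b"void f(){}\n"})

# Three distinct files plus two directories; the shared LICENSE is stored once.
print(len(archive.objects))
```

Because a directory's identifier depends on the identifiers of its children, any change to a file changes the identifier of every directory above it, which is what makes the structure tamper-evident as well as compact.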
We harvest publicly available source code from many code hosting platforms and keep up with the development happening there. We are sincerely grateful to one of these, GitHub, for going the extra mile to make our harvesting work easier, and we hope that many others will follow its example.
You can explore the Software Heritage archive right now.