The mission of Software Heritage is to collect, preserve and share all publicly available source code. With more than 6.4 billion source files from more than 91 million projects, including Debian, GitHub, GitLab, Gitorious, Google Code, GNU, the Python Package Index and more, the Software Heritage archive has the unique ability […]
Back in November 2016, Nicolas Dandrimont wrote about structural code changes leading to a massive (+15 million!) upswing in the number of repositories archived by Software Heritage through a combination of automatic linkage between the listing and loading scheduler, new understanding of how to deal with extremely large repository hosts […]
You might have noticed a “slight” bump in the number of projects referenced on our archive page over the last few days. Did we suddenly uncover a treasure trove of new repositories? Well, not yet… This bump comes from a change in our process that is a stepping stone towards […]
When we started Software Heritage, we did not expect that publicly available source code was going to be endangered so quickly. Among other major source code hosts, Gitorious was already being taken down in mid-2015, before our initial infrastructure was ready! Luckily, fellow hackers of the Archive Team were already working full […]