Software Heritage: Ethical Charter for Mirrors


Software is at the heart of our digital society and embodies a growing part of our scientific, technical and organizational knowledge. As a consequence, software source code is now a growing part of our cultural heritage and a valuable asset for education, research, and industry.

The core mission of Software Heritage is to ensure that this precious body of knowledge will be preserved over time and made available to all, by collecting, preserving, and sharing all available software in source code form. Together with its complete development history. Forever.

We do this for multiple reasons. To preserve the scientific and technological knowledge embedded in software source code, which is a significant part of our heritage. To allow better software development and reuse for society and industry, by building the largest open software knowledge database, enabling the development of a broad range of value-added applications. To foster better science, by assembling the largest curated archive for software research, and building the infrastructure for preserving and sharing research software.

We do this now, because we are at a turning point: on one side, the founding fathers of computer technology are still around, and willing to contribute their knowledge, but only for a limited time. On the other, we seem to be at increasing risk of massive loss of source code developed collaboratively, because of hosting sites that shut down when their popularity decreases, and the lack of a structured effort to archive software artefacts.

We recognize that building a long-term common infrastructure is a challenging undertaking.

To maximize the chances of success in the long term:

  • we have established Software Heritage as a non-profit, multi-stakeholder initiative open to a variety of partners from civil society, academia, industry, and governments;

  • we will develop and release all the software components needed to operate its services under a Free and Open Source Software license, welcoming contributions to it;

  • we will build replication and diversification in the system at all levels, fostering the emergence of a geographically distributed network of mirrors that may be implemented using a variety of storage technologies, in various administrative domains, controlled by different institutions, and located in different jurisdictions.

Mirrors play an important role and are entrusted with significant power: they receive a full copy of the contents of the Software Heritage archive, they work closely with Software Heritage to create and maintain their mirror, they have the possibility of developing and experimenting with technologies alternative to those made available by Software Heritage, and their voice may be heard in the Software Heritage governance.

But with power comes responsibility, and this Ethical Charter highlights the principles that Mirrors commit to respect. This Ethical Charter may evolve over time, and changes will be announced on the Software Heritage website.

Avoid harm

The source code collected in the Software Heritage archive enables a broad range of analyses and applications, in many areas of research. Well-intended actions, including those that are carried out purely for research purposes, may lead to harm. When researchers request access to the full dataset, Mirrors will ask them to refrain from redistributing the dataset, and to declare any potential ethical issue arising from their use of the data. Mirrors will also ask researchers to commit to the Code of Conduct made available by Software Heritage before accessing significant parts of the archive.

Protect Personal Data

The Software Heritage archive collects publicly available source code, and its development history, from a variety of public sources. Any personal information that may be contained in the source code or in the development history will hence be collected in the archive, of which Mirrors obtain a full copy as part of their mission. Even where the local legislation does not make it mandatory, Mirrors will strive to adopt processes, policies and terms of use that protect personal data in general, and in particular to safeguard from abusive behavior the people who, through their work and dedication, created the very software commons we are preserving. Mass mailing software developers is a well-known example of misuse that is clearly unacceptable, but there may be many others.

Maintain coherent terms of use

Mirrors will adopt, as much as possible, terms of use that follow the spirit of the terms of use designed and adopted by Software Heritage. Restrictions in the terms of use that are introduced to reduce harm and protect personal data should not be removed, even if the local legislation does not impose them. If additional restrictions are imposed by the local legislation, they should be incorporated only to the extent needed to comply with the legislation.

Ensure fair and non-discriminatory access

Mirrors will implement terms of use for accessing the archive contents, be it through a public API or user-facing applications, that are sufficiently permissive to comply with the approach and spirit of this mission. In particular, access to the Mirror should be fair and non-discriminatory. Usage restrictions should be limited to those necessary to ensure proper operation of the Mirror’s infrastructure, and protection from abusive behavior.

Foster Collaboration

The long-term success of Software Heritage depends on creating a community that shares the project mission and contributes to it. Particularly welcome contributions include the development of policies, technologies, and infrastructure needed to maintain and expand both the archive and the Mirror network. Mirrors will maintain open communication with Software Heritage and the Mirror network, sharing all proposals, issues and concerns with the community, striving to find solutions collaboratively. Mirrors will contribute back to Software Heritage all improvements to its software and encourage collaboration. Mirrors are free to develop and experiment with technologies alternative to those made available by Software Heritage, provided they maintain full compatibility with Software Heritage; they are encouraged to share their findings with the community.