HOWTO archive and reference your code
Properly archiving and referencing your source code is an important step to support Open Science and to comply with the Know Your Software principle (KYSW). This page provides a concise checklist to do so seamlessly using Software Heritage.
Step 1: prepare your public repository
- add a README file
- add an AUTHORS file
- add license information in one of the two recommended ways
  - a LICENSE file at the root of your project, or
  - a LICENSES directory containing all the licenses used in your project, and an SPDX-compliant copyright header in all your source code files (see the REUSE instructions for details and tools)
- (optionally) add a codemeta.json file containing machine-readable metadata (it can be produced using the CodeMeta Generator)
It is now accepted practice to also add a Markdown version of the README file, but please keep the AUTHORS and LICENSE files as plain text.
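The checklist above can be verified mechanically before you publish. The following is a minimal sketch, assuming the repository is checked out locally; `check_repository` and its helper are hypothetical names introduced here for illustration, not part of any Software Heritage or REUSE tool, and the sketch does not inspect SPDX headers inside source files.

```python
from pathlib import Path

# Files the checklist asks for; a Markdown variant such as README.md also counts.
REQUIRED_FILES = ("README", "AUTHORS")


def _has_file(root: Path, stem: str) -> bool:
    """Return True if `root` directly contains a file named `stem`,
    with or without an extension (README, README.md, ...)."""
    return any(
        entry.is_file() and entry.name.split(".")[0].upper() == stem
        for entry in root.iterdir()
    )


def check_repository(root) -> list:
    """Return the checklist items that are missing from the repository root.

    Hypothetical helper: it only looks at file and directory names.
    """
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not _has_file(root, name)]
    # License information may be a LICENSE file or a LICENSES directory.
    if not (_has_file(root, "LICENSE") or (root / "LICENSES").is_dir()):
        missing.append("LICENSE file or LICENSES directory")
    return missing
```

Calling `check_repository(".")` from the root of a checkout returns an empty list when nothing is missing. The codemeta.json file is left out of the check because it is optional.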
Step 2: save your code in Software Heritage
There are several ways to ensure your code is archived, depending on your preferred workflow and where the code is hosted. They are summarised below; pick the one that fits you best.
Manually through the updateswh browser extension
For projects hosted on Bitbucket, GitHub, GitLab.com or any GitLab instance, the easiest way to trigger archival manually is to install the dedicated updateswh browser extension, available for Chrome and Firefox, which lets you do it in one click.
Manually through the save code now form
- go to the Software Heritage save code now page
- choose the appropriate version control system in the drop-down list (git, svn and hg are currently supported)
- enter the code repository reference URL
  N.B. this must be the URL that lets you clone or check out your project from the repository. If in doubt, double-check it by cloning or checking out your project into a temporary directory with the appropriate command. Here are examples for git, svn and hg:
  - git clone <your url>
  - svn checkout <your url>
  - hg clone <your url>
- click on the Submit button.
Automatically, via the API
You can trigger archival of a repository programmatically through the Software Heritage API, which lets you make archival part of any development workflow. If you use one of the popular code hosting platforms, we recommend the specific API endpoints designed for the webhooks of these platforms: see the documentation for the Bitbucket endpoint, the GitHub endpoint, the GitLab endpoint (which covers all GitLab instances), and the SourceForge endpoint. On GitHub you can also use a GitHub action, which may be simpler to configure, but keep in mind that it uses considerably more resources behind the scenes than a webhook call for the same end result.
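As an illustration of the generic (non-webhook) route, here is a minimal sketch that prepares a save request using only the Python standard library. It assumes the Save Code Now endpoint has the form `/api/1/origin/save/<visit type>/url/<origin url>/` and accepts a POST request; please check the path, rate limits and authentication options against the API documentation before relying on it. The function names and the example repository URL are hypothetical.

```python
import urllib.request

# Assumed API root of the main Software Heritage archive.
API_ROOT = "https://archive.softwareheritage.org/api/1"

# Version control systems named in the save code now form.
SUPPORTED_VISIT_TYPES = ("git", "svn", "hg")


def save_request_url(visit_type: str, origin_url: str) -> str:
    """Build the (assumed) Save Code Now endpoint URL for a repository."""
    if visit_type not in SUPPORTED_VISIT_TYPES:
        raise ValueError(f"unsupported visit type: {visit_type!r}")
    return f"{API_ROOT}/origin/save/{visit_type}/url/{origin_url}/"


def build_save_request(visit_type: str, origin_url: str) -> urllib.request.Request:
    """Prepare, without sending, the POST request that asks for archival."""
    return urllib.request.Request(
        save_request_url(visit_type, origin_url), method="POST"
    )


if __name__ == "__main__":
    # Hypothetical repository URL; replace it with your own clone URL.
    request = build_save_request("git", "https://github.com/example/project")
    print(request.get_method(), request.full_url)
    # To actually submit the request, uncomment the following line:
    # urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the sketch easy to test offline and to adapt to a CI job or a release script.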
Step 3: reference your code
To properly reference your code, use its Software Heritage identifier (SWHID), which you can obtain from the Software Heritage archive as described below.
Getting the SWHID of a full directory
Navigate to the directory you are interested in (you can use the browser extension to quickly find the archived repository) and pull out the red Permalinks tab. Then copy the SWHID, or the permalink, to the clipboard for insertion in your documents.
Getting the SWHID for a code fragment
You can also get the SWHID of a file, or of a code fragment inside a file. To do so, first navigate to the file, then (optionally) select the code fragment of interest by clicking on the line number of its first line and shift-clicking on the line number of its last line. Finally, pull out the red Permalinks tab and copy the SWHID or the corresponding permalink.
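To make the structure of a SWHID concrete, here is a minimal sketch of a parser. It assumes the usual layout: a core identifier `swh:1:<type>:<40 hexadecimal digits>`, followed by optional qualifiers separated by semicolons, such as `origin`, `path` and `lines`. The names `parse_swhid` and `permalink` are hypothetical helpers, and the sample identifier is the code fragment SWHID from the Parmap citation at the end of this page.

```python
import re
from typing import NamedTuple

# Core identifier: scheme, version 1, object type, 40 hexadecimal digits.
_CORE = re.compile(r"^swh:1:(snp|rel|rev|dir|cnt):([0-9a-f]{40})$")

# Code fragment SWHID quoted in the Parmap citation on this page.
EXAMPLE = (
    "swh:1:cnt:d5214ff9562a1fe78db51944506ba48c20de3379;"
    "origin=https://gitorious.org/parmap/parmap.git;"
    "visit=swh:1:snp:78209702559384ee1b5586df13eca84a5123aa82;"
    "anchor=swh:1:rev:0064fbd0ad69de205ea6ec6999f3d3895e9442c2;"
    "path=/parmap.ml;lines=101-143"
)


class SWHID(NamedTuple):
    object_type: str  # snp, rel, rev, dir or cnt
    object_id: str    # hexadecimal hash of the object
    qualifiers: dict  # e.g. {"path": "/parmap.ml", "lines": "101-143"}


def parse_swhid(text: str) -> SWHID:
    """Split a SWHID into its core parts and its qualifiers."""
    # Citations often wrap the identifier in angle brackets; drop them.
    core, *raw_qualifiers = text.strip().strip("<>").split(";")
    match = _CORE.match(core)
    if match is None:
        raise ValueError(f"not a valid core SWHID: {core!r}")
    qualifiers = dict(item.split("=", 1) for item in raw_qualifiers)
    return SWHID(match.group(1), match.group(2), qualifiers)


def permalink(text: str) -> str:
    """Return the archive permalink for a SWHID (prefix with the archive URL)."""
    return "https://archive.softwareheritage.org/" + text.strip().strip("<>")
```

For instance, `parse_swhid(EXAMPLE).qualifiers["lines"]` gives `"101-143"`, the line range that was selected when the fragment was referenced.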
For LaTeX users
If you use LaTeX for your documents, then you’ll love the biblatex-software package: it lets you produce beautiful bibliographic entries for software, and it supports SWHID natively. Biblatex-software is integrated in CTAN and TeXLive, and works out of the box in Overleaf (here is a template document for the official ACM article style). As of April 2022, biblatex-software is integrated in the ACM article style.
Here is an example of what you can obtain, extracted from the article «Archiving and referencing source code with Software Heritage», ICMS 2020. Try clicking on the links to see the result, watch the tutorial below to learn more, and read the documentation to tweak it to your taste.
 [SW] Roberto Di Cosmo and Marco Danelutto, The Parmap library, 2012. University Paris Diderot and University of Pisa. LIC: LGPL-2.0. URL: https://rdicosmo.github.io/parmap/
 [SW Rel.] Roberto Di Cosmo and Marco Danelutto, The Parmap library version 0.9.8, 2012. University Paris Diderot and University of Pisa. LIC: LGPL-2.0. SWHID: <swh:1:rev:0064fbd0ad69de205ea6ec6999f3d3895e9442c2;origin=https://gitorious.org/parmap/parmap.git;visit=swh:1:snp:78209702559384ee1b5586df13eca84a5123aa82>
 [SW exc.] Roberto Di Cosmo and Marco Danelutto, “Core mapping routine”, from The Parmap library version 0.9.8, 2012. University Paris Diderot and University of Pisa. LIC: LGPL-2.0. SWHID: <swh:1:cnt:d5214ff9562a1fe78db51944506ba48c20de3379;origin=https://gitorious.org/parmap/parmap.git;visit=swh:1:snp:78209702559384ee1b5586df13eca84a5123aa82;anchor=swh:1:rev:0064fbd0ad69de205ea6ec6999f3d3895e9442c2;path=/parmap.ml;lines=101-143>