Public Code Observatory
Executive summary
Open source is part of the critical infrastructure of modern government. Public institutions rely on it to deliver services, conduct research, and build digital capacity. They also create code and improve projects maintained beyond their borders. Until now, however, that contribution has been difficult to see: there has been no common, verifiable evidence base showing where public-sector participation happens, whether it lasts, or how it connects countries.
The Public Code Observatory, launched by Software Heritage with the UN Office for Digital and Emerging Technologies (ODET) and UNESCO, is designed to close that gap. It combines the Software Heritage universal Archive of source code with public policy indicators and a collaboratively curated registry of government domains. The result is a repeatable, auditable view of how governments engage with the open-source commons – not a one-off ranking, but a shared measurement capability that can improve over time.
This inaugural analysis covers 199 countries and territories, identifies 92,831 contributors associated with government email domains, and traces their activity across 288,411 archived repositories. It measures contribution, not software use or ownership. Every headline result can be traced to an openly licensed dataset and a documented method.
The central finding changes the policy frame. Across 158 countries, a country’s open-source code contributions correlate twice as strongly with its digital-government capacity (UN EGDI, Spearman ρ = 0.52) as with its GDP per capita (ρ = 0.26). Because the share of variance explained scales with the square of the correlation coefficient, that “twice as strong” correlation translates into roughly a fourfold difference in explanatory power. The takeaway for policymakers is clear: while correlation is not causation, participation in global public technology is not a luxury reserved for rich nations. A country that builds digital institutions, cultivates skills, and lays the infrastructure can punch far above its economic weight.
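The fourfold figure can be checked by squaring the two reported coefficients. This is a rough illustration: for a rank correlation such as Spearman's, the squared coefficient is best read as an approximate share of variance in ranks, not in the raw values.

```latex
\[
\rho_{\mathrm{EGDI}}^{2} = 0.52^{2} \approx 0.270, \qquad
\rho_{\mathrm{GDP}}^{2} = 0.26^{2} \approx 0.068, \qquad
\frac{\rho_{\mathrm{EGDI}}^{2}}{\rho_{\mathrm{GDP}}^{2}}
  = \left(\frac{0.52}{0.26}\right)^{2} = 4
\]
```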
The analysis also exposes a sustainability challenge. Roughly five out of six repositories receiving a first government contribution fall silent within a year. Public value therefore depends less on launching more projects than on maintaining useful ones, contributing upstream, and rewarding long-term stewardship.
The Observatory turns these findings into a service. Governments, public-sector OSPOs and researchers will be able to verify the results, improve the domain registry, re-aggregate the data, and develop country profiles. By joining this effort, public institutions can move from isolated inventories and anecdotes to a common evidence base for policy, investment and cooperation.
What the Observatory makes visible
Relationship between government-associated contribution activity and four development indicators. UN EGDI has the strongest correlation; GDP per capita is notably weaker.
Government open-source output correlates more strongly with the UN E-Government Development Index than with income, population or internet use. The result holds across countries rather than being driven by a few very large contributors.
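For readers who want to see what a Spearman coefficient measures, the following is a minimal sketch in plain Python. It is not the Observatory's own pipeline; the function names are hypothetical, and the approach (average ranks for ties, then a Pearson correlation on the ranks) is the standard textbook definition.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average 1-based rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only ranks enter the calculation, a handful of very large contributors cannot dominate the coefficient the way they could with a correlation computed on raw counts.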
Why it matters
Digital public capacity is a policy lever. Investment in online-service maturity, connectivity, institutions, and skills can support meaningful participation in shared software infrastructure. The Observatory will allow countries to track whether capacity-building is translating into durable contribution.
The dataset finds public-sector-associated contributions in every national jurisdiction covered. Yet contributor headcount is highly concentrated: the Americas account for 72% of measured contributors, Africa for under 1%, and the country-level Gini coefficient is 0.93.
Headcount is only one view. The number of distinct projects touched reveals a wider geography of engagement: countries with modest contributor communities may still participate across a broad range of global projects. Reading the two maps together distinguishes scale from reach and points to where targeted capacity-building could have the greatest relative effect.
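The concentration figure cited above is a Gini coefficient computed over per-country contributor counts. As a minimal sketch of how such a coefficient can be calculated, the function below uses the common sorted-values formula; the function name and the toy inputs are illustrative assumptions, not Observatory data or code.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly equal,
    values approaching 1 mean almost everything is held by one unit."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # Weight each value by its 1-based position in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Toy example (made-up counts): one country holds every contributor.
print(gini([0, 0, 0, 10]))  # 0.75
```

With only four units, the maximum possible value is 0.75; with roughly 200 countries the ceiling is close to 1, which is why a value of 0.93 indicates very strong concentration.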
Two views of global participation
Government-associated contributors by country. Contributors are concentrated in a small number of countries.
Open-source projects with government-associated contributors. Projects are spread across more countries than the contributor count suggests.
From presence to lasting public value
Share of each yearly cohort of repositories with a first government-associated contribution that are still active one year later.
Only about one repository in six remains active a year after its first measured government contribution. Creating or touching code is not enough: public value accumulates through maintenance, security updates, documentation, governance, and continuing upstream participation.
Policy response. Governments must track project continuation alongside initial creation and fund software maintenance explicitly. Public-sector OSPOs should be given a clear mandate to steward dependencies and upstream relationships, while institutional metrics must evolve to recognize sustained contribution to team and program objectives.
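A government that wants to track continuation alongside creation could compute a one-year continuation share per yearly cohort along the following lines. This is a sketch under stated assumptions: "still active" is taken here to mean any recorded activity at least 365 days after the first government-associated contribution, which may differ from the definition in the Observatory's documented method, and the function name and input shape are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def one_year_survival(repos, window_days=365):
    """Share of each yearly cohort still active after `window_days`.

    `repos` is an iterable of (first_gov_contribution, last_activity) dates.
    A repository belongs to the cohort of the year of its first
    government-associated contribution.
    """
    totals = defaultdict(int)
    survivors = defaultdict(int)
    window = timedelta(days=window_days)
    for first, last in repos:
        cohort = first.year
        totals[cohort] += 1
        # Assumption: activity on or after the window boundary counts as survival.
        if last - first >= window:
            survivors[cohort] += 1
    return {year: survivors[year] / totals[year] for year in sorted(totals)}


# Made-up cohort: six repositories first touched in 2020, one still active later.
sample = [
    (date(2020, 1, 10), date(2020, 3, 1)),
    (date(2020, 2, 5), date(2020, 2, 5)),
    (date(2020, 4, 1), date(2020, 9, 30)),
    (date(2020, 6, 15), date(2020, 7, 1)),
    (date(2020, 8, 20), date(2021, 1, 5)),
    (date(2020, 3, 3), date(2022, 5, 9)),
]
rates = one_year_survival(sample)
```

Recent cohorts need care: a repository first touched less than a year before the data cut-off cannot yet be classified, so such cohorts are best excluded or flagged.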
Only 1,409 repositories (0.5%) include measured government-associated contributors from more than one country. The strongest links include the United Kingdom-United States (404 shared projects), Australia-United States, and Brazil-United States. European links such as France-Germany are visible but thin.
These repositories show a distinctive form of digital cooperation: public institutions co-maintaining shared infrastructure without first requiring a bilateral program. The small baseline is precisely why the opportunity matters. The Observatory can identify existing links, reveal missing ones, and help partners focus cooperation on software they already depend upon.
Country pairs by number of shared projects among the 30 leading countries, shown on a logarithmic scale.
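The country-pair counts behind a chart like this can be derived from a mapping of repositories to the countries of their government-associated contributors. The sketch below is illustrative only: the function name, the input shape and the sample identifiers are assumptions, not the Observatory's published schema.

```python
from collections import Counter
from itertools import combinations


def shared_project_pairs(repo_countries):
    """Count, for each unordered country pair, the repositories they share.

    `repo_countries` maps a repository identifier to the set of country
    codes with at least one government-associated contributor.
    """
    pairs = Counter()
    for countries in repo_countries.values():
        # Sorting makes each pair appear in one canonical order, e.g. ("GB", "US").
        for a, b in combinations(sorted(set(countries)), 2):
            pairs[(a, b)] += 1
    return pairs


# Made-up example data.
example = {
    "repo-1": {"GB", "US"},
    "repo-2": {"US", "GB", "AU"},
    "repo-3": {"FR"},
}
pairs = shared_project_pairs(example)
multi_country = sum(1 for c in example.values() if len(c) > 1)
```

A repository with contributors from three countries adds one to each of its three pairs, so pair counts can sum to more than the number of multi-country repositories.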
Government-associated contributions to registered Digital Public Goods, by country and Sustainable Development Goal
Only 10 registered Digital Public Goods across eight countries show measured upstream government contribution. The dominant pattern is therefore deployment without contribution back to the shared code base.
Policy response. Add proportionate “contribution-back” expectations directly into procurement rules, implementation contracts, and digital-cooperation programs. Upstream contribution can reduce duplicated maintenance, improve security and local suitability, and distribute stewardship more fairly.
AI-related work is a growing share of measured government open source and is concentrated in digitally mature countries. The Netherlands leads at 7.2%, approximately 5.1 times the global mean; Japan, Singapore, and India are also prominent, while Latin America trails.
This is an early, indicative measure, not a complete inventory of public-sector AI. It nevertheless offers a useful forward signal: the same institutional capacity associated with wider open-source contribution also shapes who can participate in the development of public-interest AI.
Countries with the highest AI-related share of measured government-associated open-source output.
From evidence to action
The findings suggest a focused first agenda for the Observatory and its participating governments:
- Make national contribution visible. Validate government domains, distinguish agencies and levels of government, and publish a country profile that can be checked and improved.
- Reward continuity, not project counts. Use one-year and multi-year continuation, upstream activity and maintenance health as core public-code indicators.
- Build contribute-back into public spending. Ask suppliers and programme partners to identify upstream dependencies and return generally useful improvements whenever feasible.
- Target capacity where the headroom is greatest. Connect digital-government programmes with contributor skills, public-sector OSPOs, and maintainership pathways, particularly in underrepresented regions.
- Turn shared dependencies into cooperation. Use the evidence base to identify software already shared across borders and organise joint stewardship around it.
- Improve reuse readiness. Address the large share of repositories without a clear licence and the near absence of citation metadata through simple, standard requirements.
A trusted and evolving evidence base
What the Observatory provides
- Archive-backed evidence. Measurements are computed over source code preserved by Software Heritage, including projects no longer available from their original host.
- Reproducible analysis. Figure-level tables, a reproduction guide, and pseudonymized project data allow independent verification and re-analysis.
- Comparable country profiles. A common framework supports national assessment without pretending that one number can capture every dimension of public code.
- Collaborative curation. Governments and experts can improve domain coverage, agency classification, and policy interpretation.
- Longitudinal monitoring. Repeated editions can show whether participation becomes broader, more sustainable, and more collaborative.
Scope and limits
The Observatory is clear about its boundaries because credible public metrics demand transparency.
- Contribution, not ownership or use. The analysis counts repositories touched by contributors using identified government email domains. It is not an inventory of government-owned software or open-source deployment.
- An evolving archive horizon. Time series are conservative through 2022. Activity in 2023-2024 is undercounted because of archive ingestion timing and must not be read as a policy reversal.
- Uneven domain structures. The United States uses hundreds of agency subdomains, while many countries aggregate activity under a small number of domains. This affects raw contributor shares and the granularity of agency comparisons.
- Coverage is still improving. Some regional data has not yet been integrated. For example, the current data for Germany only covers the federal level, with state-level (Länder) repositories pending.
- Indicative thematic classifiers. DPGs, SDGs, and AI tags are conservative lower bounds based on registries and keyword methods. Reported counts are floors, not exhaustive inventories.
- Association, not proof of causation. Correlations with the United Nations E-Government Development Index (EGDI) and other indicators support comparison and hypothesis-building; on their own, they do not establish a causal relationship.
Far from undermining the work, these limitations define exactly where the evidence is strongest, prevent overreaching conclusions, and set a clear roadmap for future updates.
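The first and third limits above both stem from how contributors are attributed: by matching email domains against a curated registry. As a rough sketch of that idea, assuming a simple suffix-matching rule that is not confirmed as the Observatory's actual method, an attribution step might look like this. The function name and the registry entries are hypothetical examples.

```python
def match_government_domain(email, registry):
    """Return the country code for the longest registry suffix that matches
    the email's domain, or None if no suffix is registered.

    Matching is done on whole DNS labels, so "notgov.uk" does not match "gov.uk".
    """
    _, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not domain:
        return None
    labels = domain.split(".")
    # Try the full domain first, then progressively shorter suffixes.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in registry:
            return registry[candidate]
    return None


# Illustrative registry entries only; a real registry is curated and far larger.
registry = {"gov.uk": "GB", "gouv.fr": "FR", "nasa.gov": "US"}
```

Under this rule, a country that registers many agency subdomains and a country that uses a single umbrella domain are both matched, but only the former allows agency-level breakdowns.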
Go deeper
- Full report: the complete analysis, with every section and every chart in a single readable document.
- Comprehensive edition: the most detailed report, an extended deep-dive into preservation, resilience, hosting, and longevity.
- Companion dataset and reproduction guide: openly licensed analysis tables, data, and method.

