March 12, 2024

Highlights of the 2024 Software Heritage Symposium

We’re excited to share the news that the Software Heritage Symposium 2024 materials are now accessible online: those who couldn’t join us in person for our annual event on February 1st, 2024 will now find video recordings of all sessions and the detailed slide decks from our speakers on the event webpage.

Over the past few years, the Software Heritage Symposium has evolved into a cornerstone event that unites diverse stakeholders from different sectors. The momentum began with the inaugural Symposium in 2021, which celebrated Software Heritage’s 5th anniversary. The momentum continued with a second Symposium in 2023, further solidifying the importance of our mission. Most recently, in February 2024, we convened the 3rd Symposium, reinforcing our commitment to safeguarding the vast knowledge embedded in software source code.

Organized in collaboration with UNESCO, the event provided an opportunity for attendees to explore the varied impacts and applications of software source code archival across different sectors. By bringing together a diverse group of presenters and panelists, the event fostered a rich exchange of ideas and experiences, highlighting the ongoing significance of software source code in society and the importance of preserving our digital legacy. Here, we delve into the key insights and discussions that emerged during this enlightening event.

UNESCO and Inria’s commitment to Software Preservation

Launched in 2016 by Inria, Software Heritage is a pioneering initiative to collect, preserve, and share all software source code as a common infrastructure at the service of cultural heritage, science, industry, and society as a whole.

In a significant milestone, UNESCO and Inria joined forces in 2017 through a formal partnership agreement to collaborate on preserving and disseminating software source code knowledge. This partnership reflects a commitment to safeguarding our digital legacy for future generations, enhancing access to information, and supporting global innovation and education.

Symposium Highlights

UNESCO – Paris | © Inria / Photo M. Magnin

The third annual Symposium opened with a welcome address by Mr Fackson Banda, Chief of the Documentary Heritage Unit at UNESCO. The unit is the Secretariat of the Memory of the World (MoW) programme, co-host of the event.

In her opening remarks, Ms Marielza Oliveira, Director, Communications and Information – Division for Digital Inclusion, Policies and Transformation, UNESCO, emphasized the pivotal role of preserving and sharing the invaluable cultural heritage embedded in software code: « software is the lifeblood of our digital age, permeating every aspect of our lives and driving innovation across industries. »

Marielza Oliveira at UNESCO – Paris | © Inria / Photo M. Magnin.

Mr Gilles Mathieu, representative of the French Ministry of Research and Higher Education, highlighted the ministry’s commitment to open science by advocating for the promotion and preservation of research-generated source code through archiving in Software Heritage.

Roberto Di Cosmo at UNESCO – Paris | © Inria / Photo M. Magnin

Mr Jean-Frédéric Gerbeau, Inria Deputy CEO for Science, recalled the challenges that Software Heritage faced in harvesting software source code, which, as he put it, is arguably a more difficult operation than referencing the web itself. He recognized that the Software Heritage archive has truly flourished and emphasized the recent projects that pave the way to new opportunities.

Reinforcing the role of software and its source code as a fundamental enabler in all human activities, and setting the scope for the symposium, Mr Roberto Di Cosmo, Director of Software Heritage, stated:

“Software Heritage is building a universal source code archive as one infrastructure, shared and mutualized across all fields of endeavour, because software is the digital fabric that binds them all together.”

Following the welcome address, the main topics were discussed in panels and presentations:

  • Empowering Innovation through Software Source Code in Industry and Governments
  • Scientific Challenges in Analyzing and Learning from Preserved Source Code
  • Software Source Code in the Open Science Ecosystem
  • Software Source Code as Documentary Heritage

Industry & Public Administration Panel

Kate Stewart at UNESCO – Paris | © Inria / Photo M. Magnin

The first panel introduced topics from industry and public administration. It was moderated by Mr Roberto Di Cosmo and included Mr Guillaume Avrin, National coordinator for artificial intelligence, Direction Générale des Entreprises; Mr Omar Mohsine, Office of the United Nations Special Envoy on Technology; Mr Marc Palazon, Board member and President of the Open Source commission, Numeum; and Ms Kate Stewart, VP Dependable Embedded Systems, Linux Foundation.

Ms Kate Stewart emphasized the importance of reproducibility and the need for an infrastructure for tracking the source of truth, where Software Heritage and the SWHID play a key role. Mr Omar Mohsine explained that the UN is engaging with Open Source by building upon a strategy composed of three pillars: policy, culture change and the Open Source heroes. Mr Marc Palazon stated that the French public administration has understood that Open Source is key, and that France is today the biggest market for Open Source in Europe.

Mr Guillaume Avrin discussed the national strategy for AI and the announcement in 2023 of the extension of the super-computing facility on the Plateau de Saclay.

Guillaume Avrin, Roberto Di Cosmo, Omar Mohsine, Marc Palazon and Kate Stewart at UNESCO – Paris | © Inria / Photo M. Magnin

Scientific Challenges: Rust Analytics for Software Heritage

Sebastiano Vigna at UNESCO – Paris | © Inria / Photo M. Magnin

Mr Sebastiano Vigna presented Rust Analytics for Software Heritage, as well as the graph representation approach with the WebGraph framework. He showed the results of a collaboration between Télécom Paris and the Università degli Studi di Milano on a new graph compression framework, written in Rust and applied to the Software Heritage history graph. This innovative approach to representing large graphs opens new possibilities, with more predictable performance and a threefold speed-up, preparing for the future growth of Software Heritage.

Scientific Challenges: Big Code

Leandro von Werra, Hugging Face, and Harm de Vries, Staff Research Scientist, ServiceNow, at UNESCO – Paris | © Inria / Photo M. Magnin

Mr Leandro von Werra and Mr Harm de Vries offered insights into the development of large language models for code during their presentation. They showed how the BigCode project that they lead builds the most open and transparent models available today, making the entire data collection, filtering and training pipeline available, and providing tools for developers to check whether their code is in the training dataset and to opt out. They detailed the reasons for establishing a collaboration with Software Heritage: a shared engagement to contribute to a common good. « We partner with Software Heritage to ensure that the source code used to build the models is accessible and identifiable, enhancing the transparency of our efforts. »

Open Science Panel

The following session included lively discussions by the Open Science Panel, which demonstrated how software source code is not only a tool for the preservation of the world’s software heritage but also an instrument at the service of Open Science, emphasizing the importance of Open Source as an enabler for Open Science.

Mr Christopher S. Marcum, Senior Statistician and Senior Science Policy Analyst at the Office of the Chief Statistician of the United States, shed light on the substantial source code repository maintained by the U.S. Bureau of Statistics (USB) on GitHub. « The big thing here that is relevant for Software Heritage is that by U.S. federal policy, federal agents are required to share their open-source code… » noting that it is a policy of the United States government for federal agencies to share, at a minimum, the metadata associated with their code repositories on code.gov, aligning with executive orders previously mentioned.

Ms Claudia Bauzer Medeiros, Professor at the University of Campinas (UNICAMP), declared that « Software Heritage is a treasure, » while quoting Paul Valéry:

It depends on those who pass

Whether I am a tomb or treasure

Whether I speak or am silent

The choice is yours alone.

~ Paul Valéry

Claudia Bauzer Medeiros at UNESCO – Paris | © Inria / Photo M. Magnin

She reminded us that « unless you understand its nature and reuse it, unless you take advantage of this treasure, we must make it abundantly clear that it’s not a mere repository to be buried as a tomb. » Claudia’s words highlighted the importance of recognizing and actively engaging with the wealth of knowledge preserved within Software Heritage, emphasizing its potential to drive innovation and progress in the digital age.

Kaitlin Thaney at UNESCO – Paris | © Inria / Photo M. Magnin

Ms Kaitlin Thaney, Executive Director of Invest in Open Infrastructure, encouraged attendees to broaden their perspective on open research during her address, emphasizing the synergies between open research and other areas of innovation. « If you rely on any form of open research in your work in research and development, it’s part of your process, » she asserted. Kaitlin urged participants to consider not only the systems they utilize but also to explore open alternatives and ways to give back. She emphasized the importance of collaboration beyond government and philanthropic support, as they alone are unable to sustain long-term efforts. Kaitlin extended her gratitude and congratulations to Software Heritage for their leadership in this endeavor, concluding her remarks with an invitation for attendees to reflect on these principles.

Mr Bhanu Neupane, Programme Manager for ICT and Sciences and Open Access to Scientific Research at UNESCO, highlighted the need for reliable indicators to gauge the impact of free and open-source software on driving the open science agenda globally. « It’s crucial to develop indicators that member states can use to measure the extent to which free and open-source software contributes to scientific research in their countries ».

The Open Science panel at UNESCO – Paris | © Inria / Photo M. Magnin

Mr Roberto Di Cosmo, CEO of Software Heritage, underscored the universal nature of software during his remarks. « Software is designed for all, » he said. « Not just for specific countries, research areas, or industries, but as a common infrastructure for humanity. » Roberto stressed the importance of communication and collaboration with diverse communities, acknowledging the complexity of the open-source ecosystem. He challenged the notion that open source alone guarantees value, highlighting the need for quality control and decision-making in software development and funding. Roberto outlined Software Heritage’s mission to provide a centralized infrastructure accessible to all, aiming to streamline the multitude of individual repositories into a unified resource for the future.

Software Source code as part of Memory of the World Panel

The last session was a panel on Software Source code as part of Memory of the World, moderated by Mr Fackson Banda, which addressed different aspects of digital cultural heritage and the place of source code in the larger ecosystem of cultural heritage preservation and accessibility.

Rosana Lanzelotte at UNESCO – Paris | © Inria / Photo M. Magnin

Ms Rosana Lanzelotte, President of Musica Brasilis, showcased their collection of 6283 free sheet music downloads of Brazilian music scores, emphasizing the need for Music Character Recognition (MCR) software to decode handwritten scores. Aligning with SDG 11.4, Musica Brasilis contributes to preserving cultural heritage by going digital and promoting accessibility while adhering to the FAIR principles. Their collaboration with IICT for digital preservation and interoperable metadata exchange underscores their commitment to open access. Rosana highlighted the adaptability of Musica Brasilis’s web software, which can now support initiatives beyond its initial scope, promoting cooperation and potential contributions to networks like Software Heritage.

Mr Pio Pellizzari, Delegate of IASA, emphasized the importance of digital archiving for preserving access to invaluable audiovisual materials, highlighting the need for comprehensive documentation and collaboration to address evolving preservation needs effectively.

Valérie Schafer at UNESCO – Paris | © Inria / Photo M. Magnin

Ms Valérie Schafer, Professor at C2DH – University of Luxembourg, expressed her enthusiasm for the Software Heritage mission, highlighting its transparent technical processes and its large, open community. Valérie also quoted Lawrence Lessig’s statement « code is law » and shared the broader implications of code as a cultural artifact, shaping societal discussions around politics, gender, and ideology. She emphasized the need to engage citizens in understanding coding basics and its societal impact, envisioning a future where coding literacy becomes more widespread. Reflecting on her own research in web archives, Valérie explored the potential of hidden layers in archived web pages and praised Software Heritage’s efforts to narrate the history of code, acknowledging the diverse range of contributions from scholars in the field.

Conclusion

UNESCO – Paris | © Inria / Photo M. Magnin.

The Software Heritage 2024 Symposium served as a testament to the collective commitment to preserving our software commons. As we embark on the journey ahead, we would like to take this opportunity to thank UNESCO for their continuous partnership, our sponsors for their support and our community for their engagement. Let us continue to nurture our digital heritage while building a large community, ensuring that future generations have access to the universal source code archive.

Didn’t make it to the event? No worries!

Discover the slides to catch up on the presentations, view the snapshots from the day on our event webpage, and watch the sessions online.

Visit our Symposium 2024 webpage!