Beyond Stargate: Open code and the AI black box
Experts from France, Brazil, and the UAE explore how open-source code and transparent archives provide the essential foundation for digital sovereignty, ethical development, and linguistic inclusion.
CodeCommons is testing the limits of swh-fuse on large-scale clusters. Preliminary experiments ran on the 10,000-core Kraken cluster, where the system reached an optimal rate of 30,000 file reads per second and sustained 8,000 file writes per second.
A recent talk by Director Roberto Di Cosmo highlights how, 10 years in, Software Heritage aims its ‘large telescope’ at the future of code.
CodeCommons aims to provide a centralized repository of essential resources, including code, documentation, and metadata, to facilitate the creation of smaller, more effective datasets for the next generation of AI tools.
CodeCommons is a two-year project building on the Software Heritage archive. Here’s an overview of the projects we and our partners are working on.
CodeCommons, a two-year project funded by the French government, is building on Software Heritage—the world’s largest public source code archive—to create higher-quality datasets for responsible artificial intelligence.