CodeCommons is testing the limits of swh-fuse using large-scale clusters. Preliminary experiments ran on the 10,000-core Kraken cluster. They validated performance, reaching an optimal rate of 30,000 file reads per second and sustaining 8,000 file writes per second.
A recent talk by Director Roberto Di Cosmo highlights how 10 years in, Software Heritage aims its ‘large telescope’ at the future of code.
CodeCommons aims to provide a centralized repository of essential resources, including code, documentation, and metadata, to facilitate the creation of smaller, more effective datasets for the next generation of AI tools.
CodeCommons is a two-year project building on the Software Heritage archive. Here’s an overview of the projects we and our partners are working on.
CodeCommons, a two-year project funded by the French government, is building on Software Heritage—the world’s largest public source code archive—to create higher-quality datasets for responsible artificial intelligence.