CodeCommons annual review – 2025-09-22
CodeCommons celebrated its first anniversary by bringing the community together for a plenary event at Inria Paris on September 22, 2025. The day was dedicated to reviewing our progress, sharing a common vision, and diving into the technical work that’s bringing the project to life.
Shared vision and status update
The morning session set the stage with a look at the big picture and a comprehensive status report:
- Introduction and vision: Roberto Di Cosmo kicked off the day, reinforcing the mission and future path of CodeCommons.
- Status report: Benoit Chauvet and team representatives provided detailed updates on all ongoing activities, showing how the individual pieces are coming together.
Technical tracks and featured presentation
The afternoon split into parallel technical tracks, allowing for in-depth presentations from each task:
- Technical track 1: Focused on the core plumbing, namely infrastructure and the unified data model.
- Technical track 2: Explored advanced topics such as code analysis, similarity, and AI preferences.
- Guest talk: The day concluded with a special presentation on the Unified Data Architecture at Netflix, delivered by Alexandre Bertails.
A meaningful milestone
Thanks to everyone who joined our annual review! Your engagement and thoughtful questions made the event lively. It was fantastic to gather the community, share essential information, and reinforce our common vision, and it’s truly encouraging to see the project taking shape. Check out all sessions below.
Morning session
Introduction and vision, Roberto Di Cosmo
Status report, Benoit Chauvet and team
DiverSE Team activities review (pre-recorded)
Project context metadata collection
Programming language identification
Download morning plenary slides
Technical track 1: Infrastructure and unified data model
Slides
– HPC Infrastructure (Simeon Carstens)
– SWH Fuse (Martin Kirchgessner)
– Storage Compression (Francesco Tosoni)
– Open Source Vulnerabilities (Valentin Lorentz)
– Collect project context data (Caroline Landry)
– Programming language detection (Baptiste Mehat)
– License detection (Philippe Ombredanne)
Technical track 2: Code analysis, similarity, and AI preferences
Slides
– LLM baseline training (Djamé Seddah)
– Similarity detection (Gaël de Chalendar)
– Similarity detection (Leonardo Venuta)
– Plagiarism detection (Andrea Gurioli)
– Design patterns detection (Yassine Abdeljalil)
– AI preferences (Thomas Aynaud)