Virtually Attend FOSDEM 2026

Software Defined Storage Track

2026-01-31T10:30:00+01:00

OpenCloud has a design goal of not using a relational database. This requires deeper integration with the underlying storage system, i.e. extensive use of extended file attributes. Since features like file revisions, trash and shares are expected nowadays, OpenCloud makes use of natively supported SDS storage features to build these advanced capabilities efficiently.

In this talk we will give an overview of the storage aspects that are relevant from OpenCloud's perspective, the integrations we currently support, and ongoing research topics.

2026-01-31T11:05:00+01:00

Ceph storage: Enterprise meets Community. Our traditional Ceph storage roadmap session starts with everything that is happening in the upstream project this year and what we have planned for the future, and closes with the state of what is backed by vendor-supported products. A 360-degree look at the state of Ceph integration with OpenStack and what is planned going forward in the broader storage space, in particular with regard to features relevant to container workloads.

Architectural familiarity with Ceph is required. This session contains zero vendor pitches, and it is a caffeinated tour of what the Ceph community is working on at the feature level. Hang on to your hats, and bring questions!

2026-01-31T11:40:00+01:00

Garage is versatile object storage software focused on decentralized and geo-distributed deployments. It has been developed under the AGPL for more than 5 years and is now reaching maturity.

This talk will cover the development and new features of the 2.x releases since the last FOSDEM talk (2024), best practices for administrators, the available UIs, and a short tutorial on migrating from MinIO.

2026-01-31T12:15:00+01:00

The CernVM File System (CVMFS) is a scalable, high-performance distributed filesystem developed at CERN to efficiently deliver software and static data across global computing infrastructures, primarily designed for high-energy physics (HEP). For the Large Hadron Collider (LHC) alone, CVMFS serves around 4 billion files (~2 PB of data). CVMFS uses a content-addressable storage model, where files are addressed by their cryptographic hashes, ensuring integrity and enabling deduplication. It follows a multi-layer caching architecture in which data is published to a single source of truth (Stratum 0), mirrored by a network of distributed servers (Stratum 1), and propagated to clients via forward proxies. These layers of caching make for a cost-effective alternative to traditional file systems, offering clients reliable access to versioned, read-only datasets with low overhead. In this talk we will focus on how CVMFS interoperates with the widely adopted S3 storage interface, providing a conventional POSIX filesystem view of the objects and using the available metadata for efficient exploitation of the medium. We will also highlight the benefits of using CVMFS with containerized workflows and demonstrate tools developed to facilitate data publishing.
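The content-addressable model can be sketched as a toy key-value store in Python. This is a sketch for illustration only: CVMFS's actual object format, catalog structure, and hash handling are far more involved, and the class and paths below are made up for the example.

```python
import hashlib

class ContentAddressedStore:
    """Toy content-addressable store: objects are keyed by the hash of
    their content, so identical payloads deduplicate automatically."""

    def __init__(self):
        self.objects = {}   # content hash -> bytes (the object store)
        self.catalog = {}   # path -> content hash (the namespace view)

    def publish(self, path, data):
        digest = hashlib.sha1(data).hexdigest()
        self.objects[digest] = data     # no-op if content already stored
        self.catalog[path] = digest
        return digest

    def read(self, path):
        # Integrity comes for free: the key *is* the content hash.
        return self.objects[self.catalog[path]]

store = ContentAddressedStore()
h1 = store.publish("/sw/app-1.0/bin/tool", b"ELF...payload")
h2 = store.publish("/sw/app-1.1/bin/tool", b"ELF...payload")  # same bytes
assert h1 == h2 and len(store.objects) == 1  # two paths, one stored object
```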

Homepage: https://cernvm.web.cern.ch/fs/
Documentation: https://cvmfs.readthedocs.io/
Development: https://github.com/cvmfs/cvmfs/
Forum: https://cernvm-forum.cern.ch/

2026-01-31T12:50:00+01:00

With cost and performance requirements becoming ever more relevant in today's storage products, technologies that leverage algorithm-driven improvements are getting a lot of attention. Erasure coding is the most prominent of these and by now a well-established standard for reducing on-disk space requirements in storage systems. It is built on mathematical techniques. In my talk I want to explain and explore these techniques, and thereby the mathematical reasoning underlying these algorithms, in a way that does not require a background in mathematics (or at most an insignificant amount). I am not a software engineer myself, just an interested mathematics student who aims to introduce someone who is curious but not too fond of maths to the underlying theory of erasure coding.
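As a small taste of the idea, the simplest erasure code is a single XOR parity block (RAID-5 style): given k data blocks plus one parity block, any single lost block can be reconstructed from the survivors. Production systems use Reed-Solomon codes over finite fields to tolerate multiple losses, but the recovery principle is the same. A minimal sketch:

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]   # k = 3 data blocks
parity = xor_blocks(data)             # 1 parity block: stores only 1/k overhead

# Lose any single block, say data[1]; XORing the survivors with the
# parity cancels everything except the missing block.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == b"BBBB"
```

The saving compared to replication is the point: tolerating one loss here costs one extra block for three data blocks (33% overhead) instead of a full copy (100%).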

2026-01-31T13:25:00+01:00

Have you ever found your CephFS setup mysteriously broken and had no clue how it got there? Maybe someone ran a CLI command in haste, or a misstep happened weeks ago. We have suspicions, but can’t really recall what might've splintered the system. That changes now.

In this talk, we introduce a robust command history logging mechanism for CephFS: a persistent log of CephFS commands and standalone tool invocations, backed by LibCephSQLite. Think of it as “shell history,” but purpose-built for Ceph with time ranges, filters, and structured metadata. Every ceph fs subvolume rm, every ceph config set, every mischievous --force — now recorded, timestamped, and queryable.

Want to know what was run last Tuesday at 3 AM? Or who triggered that well-intentioned-but-catastrophic disaster recovery script? Or just list the last 100 commands before things exploded? It’s all there. This helps debug incidents faster, provides a clear audit trail, and opens the door to proactive traceability. So, when things go sideways around CephFS and no one's sure why — this history has your back.

This is CephFS-first but not CephFS-only. The path to full cluster command traceability starts here.
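As a toy illustration of such a mechanism using Python's standard sqlite3 module (the schema, column names, and commands below are hypothetical; the real log is backed by LibCephSQLite), a timestamped, queryable command history might look like:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")  # stand-in for the LibCephSQLite-backed log
db.execute("CREATE TABLE cmd_history (ts REAL, issuer TEXT, command TEXT)")

def record(issuer, command):
    """Append one command invocation to the persistent history."""
    db.execute("INSERT INTO cmd_history VALUES (?, ?, ?)",
               (time.time(), issuer, command))

record("alice", "ceph fs subvolume rm cephfs sub0 --force")
record("bob",   "ceph config set mds mds_cache_memory_limit 8G")

# "What ran in the last hour, newest first?" -- a time-ranged query.
rows = db.execute(
    "SELECT issuer, command FROM cmd_history "
    "WHERE ts > ? ORDER BY ts DESC",
    (time.time() - 3600,)).fetchall()
assert rows[0][0] == "bob"   # most recent invocation comes back first
```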

2026-01-31T14:00:00+01:00

Starting with the Tentacle release, Ceph introduces mgmt-gateway: a modular, nginx-based service that provides a secure, highly available entry point to the entire management and monitoring stack. This talk will cover its architecture and deployment, how it centralizes access to the dashboard and observability tools, and how OIDC-based Single Sign-On streamlines authentication. We’ll also show how mgmt-gateway enhances security and access control while delivering full HA for Prometheus, Grafana, Alertmanager, and the dashboard, resulting in a more resilient and user-friendly experience for Ceph administrators.

2026-01-31T14:35:00+01:00

The purpose of this talk is to highlight how Lua scripting and S3 Lifecycle Policies can be leveraged to enable dynamic placement and cost-efficient, policy-driven data retention in Ceph S3. All details at https://github.com/frednass/s3-dynamic-placement-and-archiving
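For illustration, a boto3-style lifecycle rule of the kind such a setup could apply (the tag, and the storage-class name standing in for a colder placement target, are made up for the example):

```python
# Transition objects tagged for archival to a colder storage class
# after 30 days, and expire them after a year. "CHEAP_HDD" is an
# illustrative placeholder for a site-defined storage class.
lifecycle = {
    "Rules": [{
        "ID": "archive-cold-data",
        "Filter": {"Tag": {"Key": "tier", "Value": "archive"}},
        "Status": "Enabled",
        "Transitions": [{"Days": 30, "StorageClass": "CHEAP_HDD"}],
        "Expiration": {"Days": 365},
    }]
}
# Applied with boto3 against an RGW endpoint, e.g.:
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
```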

2026-01-31T15:10:00+01:00

The CERN Tape Archive (CTA) is the open source solution developed at CERN to store more than 1 Exabyte of data from CERN’s experimental programmes. CTA interfaces with two disk systems widely used by the High-Energy Physics (HEP) community, EOS and dCache. However, until now there has been no integration with systems used outside of HEP.

Looking at current industry standards, the leading interface for file and object storage is S3, which includes cold storage extensions for data archival. The CTA team are investigating whether CTA can be fronted by an S3 API. During this talk, we’ll review a proof-of-concept implementation, and look at alternative solutions to explore along with their respective trade-offs.

2026-01-31T15:45:00+01:00

Umbrella ("U") is planned as the next major release for the Ceph Distributed Storage System open-source project. Ceph File System development in Umbrella is aimed at addressing various pain points around the file system disaster recovery process, performance metrics, MDS tuning, user data protection and backups. Many of these themes were also discussed in the Cephalocon 2024 and various user/dev meetings.

This talk details improvements in each of those areas, with a specific focus on ease of use and automation. Many noteworthy features have been introduced, improving the user experience across the board. The Umbrella release aims to provide Ceph File System users and administrators with a better and smoother experience.

2026-01-31T16:20:00+01:00

Concurrent storage access via standard network protocols such as SMB and NFS has become a common feature of many proprietary storage products. Samba, the leading open‑source SMB implementation, has long supported a limited set of multiprotocol scenarios by leveraging kernel interfaces and by allowing aspects of multiprotocol access to be implemented in the filesystem. Over time, several storage vendors have exploited these capabilities while using their own proprietary filesystems.

In this talk we will present our plan for a fully open‑source multiprotocol stack built on CephFS, Samba, and NFS‑Ganesha. First, we will describe the testing infrastructure we are creating and the use‑cases we intend to support in the initial release. We will then outline our approach to exclusive file locking and to a unified access‑control model.

2026-01-31T16:55:00+01:00

This talk introduces an advanced storage acceleration strategy for I/O-intensive container workloads. In environments like CI/CD pipelines or database applications, performance is often constrained by storage latency. Our plan addresses this by implementing a transparent data caching layer that uses high-speed local storage to hold frequently accessed data, significantly reducing retrieval times and load on the primary storage system.

With a core focus on disaster recovery and fast StatefulSet failover, the primary cloud storage volume is intentionally left pristine and unmodified, containing solely user data. All cache intelligence is kept local to the node. This design is critical for operational robustness, as it ensures the data can be restored to a consistent point in time, a fundamental requirement for reliable disaster recovery. This allows the volume to be safely attached to any node for rapid failover, maximizing both performance and data safety.
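The read-through pattern described above can be sketched in a few lines. This is a toy model only: the actual CSI driver operates on block devices and volumes, not Python dicts, and the class below is invented for illustration.

```python
class ReadThroughCache:
    """Toy node-local cache in front of a primary volume. The primary
    store is never written by the cache, so it stays pristine and can
    be detached and re-attached to another node at any time."""

    def __init__(self, primary):
        self.primary = primary   # authoritative store, never modified
        self.local = {}          # fast node-local copies of hot data

    def read(self, key):
        if key not in self.local:              # cache miss: fetch once
            self.local[key] = self.primary[key]
        return self.local[key]                 # cache hit: no remote I/O

    def failover(self):
        # Safe to drop everything: nothing in the cache was authoritative.
        self.local.clear()

primary = {"db/page0": b"rows..."}
cache = ReadThroughCache(primary)
assert cache.read("db/page0") == b"rows..."   # populates the local cache
cache.failover()                              # simulate moving to a new node
assert cache.read("db/page0") == b"rows..."   # re-warmed from pristine primary
```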

project: https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver

2026-01-31T17:30:00+01:00

For high-performance proxy services, moving data is the primary bottleneck. Whether it is an NFS-Ganesha server or a FUSE-based Ceph client, the application burns CPU cycles copying payloads between kernel and user space just to route traffic. While splice() exists, it imposes a rigid pipe-based architecture that is difficult to integrate into modern asynchronous event loops.

We propose a pure software zero-copy design that works with standard network stacks. In this model, a specialized kernel socket aggregates incoming network packets into a scatter-gather list. Instead of copying this data to the application, the kernel notifies userspace—potentially via io_uring—that a new data segment is ready and provides an opaque handle.

The application sees the headers to make logic decisions but acts only as a traffic controller for the payload. It uses the handle to forward the data to an egress socket or a driver like FUSE without ever touching the actual bytes. This talk will outline the design of this buffer-handling mechanism and demonstrate how it allows complex proxies like Ganesha and storage clients like Ceph to achieve true zero-copy throughput on standard hardware.
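The header/handle split can be illustrated with a toy model in Python. Everything here is simulated in user space purely to show the control flow; the proposed design lives in the kernel socket layer and a notification path such as io_uring.

```python
class FakeKernel:
    """Simulated kernel side: holds payload buffers and exposes only
    opaque handles to the application."""

    def __init__(self):
        self.buffers = {}        # handle -> payload held in "kernel" space
        self.next_handle = 0

    def receive(self, headers, payload):
        handle = self.next_handle
        self.next_handle += 1
        self.buffers[handle] = payload
        return headers, handle   # the app sees headers + handle, not bytes

    def forward(self, handle, egress):
        # Payload moves directly to the egress path; the application
        # never copied or even read it.
        egress.append(self.buffers.pop(handle))

kernel = FakeKernel()
egress = []

headers, handle = kernel.receive({"op": "WRITE"}, b"payload bytes")
if headers["op"] == "WRITE":     # routing decision made on headers only
    kernel.forward(handle, egress)

assert egress == [b"payload bytes"]   # delivered without an app-side copy
```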

2026-01-31T18:05:00+01:00

Ad hoc lightning talks. Every speaker gets exactly 5 minutes (or less).
