eBPF programs often behave differently than developers expect, not because of incorrect logic, but because of subtle behaviours of the hookpoints themselves. In this talk, we focus on a small set of high-impact, commonly misunderstood attachment types (kprobes/fentry, tracepoints, uprobes, and TC/XDP) and expose the internal kernel mechanics that cause surprising edge cases.
Rather than attempting to cover all eBPF hooks, this session distills a practical set of real-world gotchas that routinely affect production tools, explaining why they occur and how to work around them.
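As a taste of the mechanics discussed, here is a minimal, hedged sketch contrasting two of these attachment types in libbpf-style C; do_unlinkat is chosen purely as an illustrative target function.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* kprobe: attached at the function entry via a trap; arguments are
 * recovered from pt_regs with arch-specific macros. */
SEC("kprobe/do_unlinkat")
int BPF_KPROBE(unlink_kprobe, int dfd, struct filename *name)
{
    bpf_printk("kprobe: dfd=%d", dfd);
    return 0;
}

/* fentry: attached via a BPF trampoline; arguments arrive typed and
 * BTF-checked, generally with lower overhead than a kprobe. */
SEC("fentry/do_unlinkat")
int BPF_PROG(unlink_fentry, int dfd, struct filename *name)
{
    bpf_printk("fentry: dfd=%d", dfd);
    return 0;
}
```

The fentry variant depends on BTF and the trampoline machinery, which is exactly where kernel-version differences tend to surface.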
Training large models requires significant resources, and the failure of any GPU or host can significantly prolong training times. At Meta, we observed that 17% of our jobs fail due to RDMA-related syscall errors that arise from bugs in RDMA driver code. Unlike other parts of the kernel, RDMA-related syscalls are opaque, and the errors create a mismatched application/kernel view of hardware resources. Because of this opacity and mismatch, existing observability tools provided limited visibility and DevOps found these failures challenging to triage; we needed a new, scalable framework to analyze kernel state and identify the cause of the mismatch.
Direct approaches, such as tracing the relevant kernel calls and capturing the associated metadata, turned out to be prohibitively expensive. In this talk, we will describe the set of optimizations used to scale tracking of kernel state and the map-based systems designed to efficiently export relevant state without impacting production workloads.
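As a hedged sketch of the general pattern (not Meta's implementation), the snippet below counts uverbs commands per process in a BPF hash map that userspace can read and reconcile against the application's view of its resources; the kprobe target ib_uverbs_write and the key/value layout are assumptions chosen for illustration.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* Per-process count of RDMA uverbs commands, exported to userspace via a
 * map rather than per-event streaming. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 8192);
    __type(key, __u32);   /* tgid */
    __type(value, __u64); /* number of uverbs writes observed */
} uverbs_calls SEC(".maps");

SEC("kprobe/ib_uverbs_write")
int BPF_KPROBE(count_uverbs_write)
{
    __u32 tgid = bpf_get_current_pid_tgid() >> 32;
    __u64 one = 1, *cnt;

    cnt = bpf_map_lookup_elem(&uverbs_calls, &tgid);
    if (cnt)
        __sync_fetch_and_add(cnt, 1);
    else
        bpf_map_update_elem(&uverbs_calls, &tgid, &one, BPF_NOEXIST);
    return 0;
}
```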
Any eBPF project started in the last couple of years is most likely written to take advantage of CO-RE: compiling eBPF programs ahead of time and running the same program on a wide range of kernels and machines.
Before CO-RE it was common to ship the whole toolchain and compile on the target, which is what Cilium currently still does. Compiling on the target upheld a core value of Cilium: "you do not pay for what you do not use". But it turns out that with CO-RE you sometimes DO pay for what you do not use, which makes it painful to switch over.
This cost mostly comes in the form of unused maps that still have to be created and tail-call programs that are loaded but never called.
We created what we call "reachability analysis", which allows us to predict in userspace which parts of an eBPF program will be unused when it is loaded with a given set of global constants. This lets us avoid creating maps that will never be used or loading tail-call programs that will never be called, opening the way for Cilium's migration to CO-RE.
I would like to show how this works.
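As a minimal sketch of the idea (illustrative names, not Cilium's actual code): a read-only global constant set by the loader makes a branch provably unreachable, so a userspace reachability pass can decide before load that the map behind it never needs to be created (for example via libbpf's bpf_map__set_autocreate).

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

/* Set by the loader before load; the verifier sees it as a constant. */
volatile const bool enable_nat = false;

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, __u32);
    __type(value, __u32);
} nat_table SEC(".maps");

SEC("tc")
int handle_packet(struct __sk_buff *skb)
{
    if (enable_nat) {
        /* Only reachable when enable_nat is true: if userspace knows it is
         * false, this lookup is dead code and nat_table never has to exist
         * (e.g. bpf_map__set_autocreate(map, false) before load). */
        __u32 key = skb->mark, *rewrite;

        rewrite = bpf_map_lookup_elem(&nat_table, &key);
        if (rewrite)
            skb->mark = *rewrite;
    }
    return 0; /* TC_ACT_OK */
}
```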
This talk will go over a number of performance and reliability pitfalls of the different eBPF program types we have discovered while building production-ready, eBPF-based products at Datadog. These include the changing performance characteristics of kprobes across kernel versions, reliability issues with fentry due to a kernel bug, and the pains of scaling uprobes, among other things.
OOMProf is a Go library that installs eBPF programs listening to the Linux kernel tracepoints involved in OOM killing and records a memory profile before your Go program is dead and gone. The memory profile can be logged as a pprof file or sent to a Parca server for storage and analysis. This talk will be a deep dive into the implementation, its limitations, and possible future directions.
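As a hedged sketch of the hook point involved (not necessarily OOMProf's exact code), the program below attaches to the oom:mark_victim tracepoint and pushes the victim PID to userspace, which can then snapshot a profile before the process disappears; the context struct layout is an assumption based on the tracepoint's format file on older kernels.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 4096);
} oom_events SEC(".maps");

/* Minimal view of the oom:mark_victim context: 8 bytes of common
 * trace_entry header followed by the victim pid. This layout is an
 * assumption; newer kernels add further fields after pid. */
struct mark_victim_ctx {
    __u64 __unused;
    int pid;
};

SEC("tracepoint/oom/mark_victim")
int on_mark_victim(struct mark_victim_ctx *ctx)
{
    int *victim = bpf_ringbuf_reserve(&oom_events, sizeof(int), 0);
    if (!victim)
        return 0;
    *victim = ctx->pid;
    /* Userspace (e.g. the Go side) now captures the memory profile for
     * this pid before the process is reaped. */
    bpf_ringbuf_submit(victim, 0);
    return 0;
}
```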
XDP and AF_XDP provide a high-performance mechanism for driver-layer packet processing and zero-copy delivery of packets into userspace, while maintaining access to standard kernel networking constructs — capabilities that distinguish them from full kernel-bypass frameworks such as DPDK. However, the current AF_XDP implementation offers no efficient in-kernel mechanism for forwarding packets between AF_XDP sockets in a zero-copy manner. Because AF_XDP operates without the conventional full network stack socket abstraction, even basic localhost redirection requires an external switch or additional hardware-assisted NIC capabilities, limiting both performance and usability.
In this talk, we introduce FLASH, an extension to the AF_XDP subsystem that enables low-overhead, in-kernel packet transfer between AF_XDP sockets. FLASH provides zero-copy delivery for sockets that share a memory area and a fast single-copy datapath for sockets backed by independent memories. The design incorporates several performance-oriented mechanisms, including smart blocking with backpressure for congestion handling and an adaptive interrupt-to-busypoll transition to reduce latency under load.
We demonstrate that co-located applications using AF_XDP can leverage FLASH to achieve up to 2.5× higher throughput compared to SR-IOV-based approaches, while preserving the programming model and flexibility of the XDP/AF_XDP ecosystem. The talk will also outline future directions and how this technique can be applied to use cases such as DDoS mitigation and load balancing.
GitHub: https://github.com/networkedsystemsIITB/flash
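For context on the baseline that FLASH extends, here is the stock XDP-to-AF_XDP delivery path (a hedged sketch, not FLASH's socket-to-socket forwarding): an XSKMAP redirect keyed by RX queue, falling back to the kernel stack when no socket is registered.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

/* One AF_XDP socket per RX queue, registered by userspace. */
struct {
    __uint(type, BPF_MAP_TYPE_XSKMAP);
    __uint(max_entries, 64);
    __type(key, __u32);
    __type(value, __u32);
} xsks SEC(".maps");

SEC("xdp")
int redirect_to_xsk(struct xdp_md *ctx)
{
    /* Hand the frame to the AF_XDP socket bound to this queue; pass the
     * packet to the normal stack if the slot is empty. */
    return bpf_redirect_map(&xsks, ctx->rx_queue_index, XDP_PASS);
}
```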
The eBPF eXpress Data Path (XDP) enables high-speed packet-processing applications. Achieving high throughput requires careful design and profiling of XDP applications, but existing profiling tools lack eBPF support. We introduce InXpect, a lightweight monitoring framework that profiles eBPF programs with fine granularity and minimal overhead, making it suitable for XDP-based production systems. We demonstrate how InXpect outperforms existing tools in profiling overhead and capabilities. InXpect is the first XDP/eBPF profiling system that provides real-time statistics streaming, enabling immediate detection of changes in program behavior.
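For comparison, the coarsest built-in baseline is the kernel's per-program run statistics (run_cnt/run_time_ns, available when kernel.bpf_stats_enabled is set); the sketch below reads them for an already-loaded program, with the program ID as a placeholder. Fine-grained, streaming instrumentation as described above goes well beyond this.

```c
// Reads the kernel's aggregate run_time_ns / run_cnt counters for a loaded
// BPF program. Requires kernel.bpf_stats_enabled=1 (or BPF_ENABLE_STATS).
// The program id (42) is a placeholder.
#include <stdio.h>
#include <bpf/bpf.h>

int main(void)
{
    struct bpf_prog_info info = {};
    __u32 len = sizeof(info);
    int fd = bpf_prog_get_fd_by_id(42 /* placeholder prog id */);

    if (fd < 0 || bpf_obj_get_info_by_fd(fd, &info, &len)) {
        perror("prog info");
        return 1;
    }
    printf("runs=%llu total=%lluns avg=%lluns\n",
           (unsigned long long)info.run_cnt,
           (unsigned long long)info.run_time_ns,
           (unsigned long long)(info.run_cnt ?
                                info.run_time_ns / info.run_cnt : 0));
    return 0;
}
```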
Faced with the looming retirement of our traditional load balancer appliances, we decided to give XDP a try. Facebook's Katran library did not support layer 2 switching, which was still a requirement, so we built an eBPF application in C and a supporting library in Go.
We came across a few issues along the way - driver support for network cards gave us headaches - but on the whole eBPF has made what would have been practically unthinkable a few years ago into a relatively straightforward task.
The library is used by an application which adds configuration management, BGP, metrics, etc., and after testing on smaller services for some time, the balancer now handles streaming audio and website content for the UK's largest commercial radio broadcaster, delivering tens of gigabits per second to our audience. COTS servers handle high volumes of traffic and can be scaled or migrated to updated hardware as simply as running an Ansible job.
The interoperability of I/O monitoring and profiling tools is very limited due to their strong dependence on the underlying file system (Lustre, Spectrum Scale, NFS, etc.) and resource managers (batch jobs, VMs, containerized workloads, etc.). Widely adopted generic monitoring tools often lack the temporal information about I/O activity that is needed to understand the I/O behavior of applications. The increasing diversity of applications and computing platforms demands greater flexibility and scope in I/O characterization.

This talk proposes a framework for monitoring I/O activity using extended Berkeley Packet Filter (eBPF) technology, which has gained much traction in the observability and cloud-native landscape. By tracing the kernel's Virtual File System (VFS) functions with eBPF, it is possible to monitor I/O activity on different types of platforms such as HPC clusters, cloud hypervisors, or Kubernetes. By storing the metrics traced by eBPF programs in a high-performance time series database like Prometheus, it is possible to perform system-wide monitoring of computing platforms that use different types of local or remote file systems in a unified manner.

The talk presents the basics of eBPF and discusses the framework used to monitor I/O activity in a file system and application agnostic way. It also presents experimental results quantifying the overhead and accuracy of the proposed framework, using IOR benchmark results as the reference. The results indicate that the framework introduces negligible overhead and that the bandwidths it reports are in very good agreement with those from the IOR tests. Finally, results from a production HPC platform that uses the proposed framework to monitor I/O activity on the Lustre file system are presented.
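As an illustrative fragment of the VFS-level approach (a simplified sketch, not the full framework), the program below accumulates bytes requested per process by hooking vfs_write; a userspace exporter would periodically read this map and push the values to Prometheus.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* Bytes requested per process; scraped periodically by the exporter. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16384);
    __type(key, __u32);   /* tgid */
    __type(value, __u64); /* bytes */
} write_bytes SEC(".maps");

SEC("kprobe/vfs_write")
int BPF_KPROBE(trace_vfs_write, struct file *file, const char *buf, size_t count)
{
    __u32 tgid = bpf_get_current_pid_tgid() >> 32;
    __u64 init = count, *total;

    total = bpf_map_lookup_elem(&write_bytes, &tgid);
    if (total)
        __sync_fetch_and_add(total, count);
    else
        bpf_map_update_elem(&write_bytes, &tgid, &init, BPF_NOEXIST);
    return 0;
}
```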
When it comes to string handling, C is not the most ergonomic language for the job. But at least the standard library provides a basic set of functions for finding characters, comparing strings, or finding substrings. The same has not been true for eBPF programs, where developers had to implement all these operations manually by inspecting individual bytes.
This has changed in kernel version 6.17, which added a set of eBPF kernel functions (so-called kfuncs) to perform the most common string processing operations. While implementing these sounded like a very straightforward job at first (just call the in-kernel implementations of the respective functions, right?), it turned out that eBPF programs have a number of specifics that made the implementation much more complicated.
In this talk, I will walk you through this journey and show how and why the kfuncs have to be implemented differently from the in-kernel implementations. We'll also dive into the API specifics and demonstrate how string kfuncs have been adopted by bpftrace [1] and what benefits they brought.
[1] https://bpftrace.org/
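To make the API concrete, here is a sketch of how such kfuncs are declared and called from a BPF program; the exact names and return conventions shown (bpf_strlen, bpf_strstr) are assumptions to be checked against the 6.17 headers and the talk itself.

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";

/* String kfunc declarations; names and signatures are assumptions based on
 * the 6.17 series and must match the running kernel's BTF. */
extern int bpf_strlen(const char *s) __ksym;
extern int bpf_strstr(const char *haystack, const char *needle) __ksym;

SEC("fentry/do_unlinkat")
int BPF_PROG(check_name, int dfd, struct filename *name)
{
    char buf[64] = {};

    bpf_probe_read_kernel_str(buf, sizeof(buf), name->name);
    /* Assumed convention: match offset on success, negative value on no
     * match or fault. */
    if (bpf_strstr(buf, ".tmp") >= 0)
        bpf_printk("temp file unlinked, name len=%d", bpf_strlen(buf));
    return 0;
}
```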
This talk presents the first major release of PythonBPF and how other developers can start using it. The speakers will discuss the progress of the project since it was demoed at LPC 2025 in December 2025, including what actions were taken on the feedback gathered there.
PythonBPF is a project that enables developers to write eBPF programs in pure Python. We allow a reduced Python grammar to be used for the eBPF-specific parts of the code. This allows users to:
- Write both eBPF logic and userspace code in Python (even in the same file), so Python dev tools apply to the whole file instead of just the non-BPF parts.
- Process eBPF data and visualize it using Python's ecosystem, and interactively develop and debug eBPF programs using Python notebooks.
eBPF is a powerful technology, but it is often hard to use, because its toolchain is non-trivial. In my talk I present EBPFCat, a pure Python library that can generate eBPF directly without any dependency on other code beyond the Linux kernel. Unlike most eBPF implementations, no compiler is involved. Instead, the user writes Python code which generates eBPF on-the-fly at runtime.
In EBPFCat, user- and kernel space are tightly integrated, so that both eBPF and Python code can access the same data structures. This way, one can use eBPF to write the performance-critical parts of a program, while retaining the versatility of Python for the bulk of the code. This opens eBPF to a large audience who are interested in the performance boost offered by eBPF, but are hesitant to learn an entirely new tool set for it.
I will go through a simple example to show that using EBPFCat it is possible to fit an entire eBPF program including its user space counterpart on a single presentation slide. For this example I also show how EBPFCat generates the eBPF bytecode.
While EBPFCat can be employed for usual eBPF use cases, I present one well beyond typical systems-level applications: motion control. EtherCAT is a standard fieldbus protocol based on Ethernet. Using eBPF we can reduce the latency of communication with EtherCAT devices, allowing for real-time performance. I will show a real-world combined motion system for physics research that routinely uses EBPFCat.
Developers new to eBPF will get a jump start into everything needed for their first project, while experienced kernel developers will be surprised just how far eBPF can take you beyond system programming.
Aya is a library that allows writing eBPF programs, as well as their user-space counterparts, entirely in Rust. It has been presented in previous editions of FOSDEM, but the project has evolved since then. In this talk, we will highlight what has changed, what’s coming next, and how these developments shape the Rust-and-eBPF ecosystem.
Over the last year, Aya has gained support for several new eBPF program types, as well as additional map types, such as the family of storage maps (sk_storage, task_storage, inode_storage). We’ve worked with the LLVM community to enable BTF generation for Rust eBPF programs. The overall developer experience has continued to improve, and an increasing number of open-source projects are now building on top of Aya.
We will also share updates on our ongoing work, most notably our efforts to promote Rust’s eBPF targets to Tier 2, paving the way for building eBPF programs on Rust stable without requiring nightly toolchains. Alongside this, we are developing support for BTF relocation emission, refining the user-space XDP API, and broadening coverage of program and map types.
The talk will dive into the technical details behind these features, the architectural decisions that shaped them, and the challenges ahead. We will conclude with our vision for Aya’s future and how we see it moving forward.
eBPF powers modern observability, but its behavior varies significantly across architectures. This talk examines whether eBPF can be used reliably on RISC-class systems—ARM64 and RISC-V—and what limitations appear in real workloads.
We use reproducible test environments to run tracing, profiling, and networking eBPF tools on x86_64, ARM64, and RISC-V, revealing practical differences in verifier constraints, helper availability, JIT maturity, and performance overhead. RISC-V support exists but remains incomplete, and we show exactly which features succeed, fail, or behave unpredictably.
Using a database benchmark as a workload generator, we compare instrumentation accuracy, latency impact, and stability across architectures. Attendees gain a clear understanding of eBPF’s practical portability and how to build a realistic multi-architecture observability testbed.
BPF Tokens are a new Linux kernel mechanism for delegating restricted eBPF privileges to unprivileged processes. This talk explains how distributions can adopt them to provide safer access to tracing, observability, and networking tools—without granting root or CAP_SYS_ADMIN.
We’ll show how token-based delegation could reshape developer workflows, container runtimes, and system services in Fedora or other distros.
The session includes a walkthrough of real token policies and discusses how distributions can help build a secure, less-privileged eBPF ecosystem.
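As a hedged sketch of the flow (paths and delegation policy are illustrative): the runtime mounts a bpffs instance inside the container's user namespace with delegate_* options, and the unprivileged loader points libbpf at it via the bpf_token_path open option so that map creation and program load use a token instead of CAP_SYS_ADMIN/CAP_BPF.

```c
// Unprivileged loader using a BPF token minted from a delegated bpffs
// mount (e.g. mounted by the runtime with options along the lines of
// delegate_cmds=prog_load:map_create). Paths, object name and policy
// are placeholders for illustration.
#include <errno.h>
#include <stdio.h>
#include <bpf/libbpf.h>

int main(void)
{
    LIBBPF_OPTS(bpf_object_open_opts, opts,
                /* libbpf creates a token from this bpffs and uses it for
                 * feature probing, map creation and program load. */
                .bpf_token_path = "/sys/fs/bpf");

    struct bpf_object *obj = bpf_object__open_file("tool.bpf.o", &opts);
    if (!obj || bpf_object__load(obj)) {
        fprintf(stderr, "open/load failed: %d\n", -errno);
        return 1;
    }
    puts("loaded without CAP_SYS_ADMIN");
    bpf_object__close(obj);
    return 0;
}
```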