Open Media devroom Track

Upstreaming Progress: Video Capture and Camera Support for Recent Rockchip SoCs

2026-01-31T10:30:00+01:00

Recent Rockchip SoCs (namely, those of the RK35 generation) integrate dedicated IP blocks for video capture and image signal processing. Yet support for these blocks in upstream Linux remains one of the last missing pieces in an otherwise well-supported SoC lineup.

This talk will begin with an overview of the contributions that have already landed in mainline, provide an update on the change sets that are currently in flight, and outline the remaining work needed to fully enable video capture and camera functionality on RK35xy SoCs.

Latency reduction in Video streaming with Linux’s camera and encoder APIs

2026-01-31T10:50:00+01:00

This talk describes our in-race-car video camera hardware and the open-source software that underpins our sub 200ms Glass to Glass streaming.

We will discuss interfacing to V4l2 (in various modes) from memory safe languages (that aren’t C) and also the problems and advantages of accessing a chip specific encoder API.   I will have a solid grumble about the increasing complexity and opacity of the linux media APIs and a moan about how much I miss Plan9 style thinking.

We will use examples from the following open source projects (among others) https://github.com/pipe/whipi https://github.com/pipe/v4l2Reader https://github.com/steely-glint/PhonoSDK

WebRTC support in WebKitGTK and WPEWebKit with GStreamer: Current status and plans

2026-01-31T11:10:00+01:00

The WebKit WPE and GTK ports are aiming to leverage GstWebRTC as their WebRTC backend. Over the years we have made progress towards this goal both in WebKit and in GStreamer. During this talk we will present the current integration status of GstWebRTC in WebKit, the achievements recently accomplished and the plans for the coming months.

Innovations with YAML/CABAC in H.264/AVC software decoding

2026-01-31T11:35:00+01:00

This talk will present a range of unusual programming techniques that were used in the development of a state-of-the-art H.264 software decoder (https://github.com/tvlabs/edge264), to drastically reduce code and binary size and improve speed. The techniques are applicable to other audio/video codecs, and will be presented as HOWTOs to help participants use them in their projects. It complements my talks from the last 2 years at FOSDEM, and will focus this time on (i) using YAML output as a cornerstone for bug-hunting and data-analysis, and (ii) optimizing the infamous CABAC serial arithmetic decoder.

Bridging the gap between browser and backend media processing

2026-01-31T11:55:00+01:00

The global software ecosystem has moved to richer and richer web experience. With the addition of A/V APIs, webgl acceleration, rich media APIs, RTC and, more recently, the wide open field of web assembly-supported features, more or and more of the typical user interaction and applications happens within the browser.

However, not all processing is meant to happen browser-side. In particular, when dealing with media with potentially large resolutions, exotic formats or complex compute-heavy effects, to provide a full user experience, it might be required to move back and forth between the browser and a backend server processing.

But this comes with its own sets of challenges: what kind of processing is well suited in the browser? How to best interface a browser-based API with a backend-end based one? How is it possible to transpose a user experience that is built on javascript APIs available in the browser to a backend-based processing where these APIs typically have no bearing.

In this talk, which is based on some of the challenges faced when building Descript, a feature-rich web-based video editor, we will review some of the technologies that are available to help interfacing web and backend processing, illustrate some of the challenges that these pose and also solve and explore potential future solutions using recent or prospective technologies.

Update on FFmpeg, VLC, related libraries and Kyber

2026-01-31T12:20:00+01:00

In this talk, I will pass in review about what happened in the VideoLAN and FFmpeg communities about VLC, FFmpeg, x264, dav1d, dav2d, checkasm, libplacebo and libspatialaudio.

And a bit of Kyber :)

All in one short talk :)

Decentralized Public Broadcast with Streamplace

2026-01-31T12:40:00+01:00

Streamplace (https://stream.place) has been spending the last two years developing a novel form of decentralized public broadcast to facilitate a live video layer for Bluesky's AT Protocol. The basis of this system are C2PA-signed one-second MP4 files that can be deterministically muxed together into larger segments for archival. This talk will give a technical overview of how all the pieces fit together and show off how our freely-licensed media server facilitates cooperative livestreaming infrastructure.

Enabling Intelligent Media Playback on RISC-V: VLC with Whisper STT and Qwen T2T on Next-Gen RISC-V AI PCs

2026-01-31T13:05:00+01:00

This joint talk by DeepComputing and contributors from the VLC project showcases how intelligent media playback and real-time audio processing are becoming a reality on open RISC-V hardware. We demonstrate VLC running Whisper (speech-to-text) and Qwen (text-to-text LLM) on ESWIN’s EIC7702 SoC with a 40-TOPS NPU, achieving practical AI-enhanced multimedia performance entirely on RISC-V. We will walk through the porting process, performance tuning across CPU/NPU, audio pipeline integration, and the technical challenges of enabling real-time inference on today’s RISC-V AI PCs. The session will also preview our upcoming 16-core RISC-V platform and discuss how VLC’s evolving AI support roadmap aligns with this next generation of RISC-V hardware. Together, we outline the upstreaming efforts required to bring AI-accelerated playback, real-time captioning, translation, and other intelligent media features to the broader open-source community.

Machine Learning in GStreamer: Frameworks, Tensors, and Analytics

2026-01-31T13:25:00+01:00

Machine learning in GStreamer is evolving rapidly, with major recent advances such as a dedicated analytics framework in the core library and new elements for integrating popular ML runtimes. These improvements further solidify GStreamer’s position as a leading open source multimedia framework for building robust, cross-platform media analytics pipelines. In this talk, we’ll explore the latest developments, including the GStAnalytics library, ONNX support, Python integration via gst-python-ml, new Tensor negotiation capabilities, and more.

imquic, a QUIC library for real-time media

2026-01-31T13:50:00+01:00

After spending the past 10 years (and more!) working with WebRTC, and even more than that with SIP/RTP, I decided to have a look at the efforts happening within the standardization community on how to leverage QUIC for real-time media. This led me to studying not only QUIC itself, but also RTP Over QUIC (RoQ) and Media Over QUIC (MoQT).

As part of my learning process, I started writing a QUIC library, called imquic. While it can (mostly) be used as a generic QUIC/WebTransport library, I also implemented native support within the library for both RoQ and MoQT, as a testbed to use for prototyping the new protocols in an experimental way. This presentation will introduce these new protocols and the imquic library implementing them, talking a bit about the existing demos and the proof-of-concept integration in the Janus WebRTC Server for QUIC-to-WebRTC translation.

Open Source video mixing and syncing with real-time control

2026-01-31T14:10:00+01:00

As of the 3.10 release, the public domain (Unlicense) media server MistServer (https://mistserver.org) gained a new feature: the ability to mix raw (UYVY pixel format only, for now) video streams, raw audio streams (PCM) and PNG images with resizing, overlapping, aspect ratio keeping and support for non-uniform frame rates between sources. Not only that - but it's even possible to control the configuration in real time without any downtime. This talk shows off what is possible, and explains how we did it in technical detail.

Covered topics:

How to efficiently store a multi-frame raw video buffer in shared memory
Synchronization handling between multiple sources
Handling sources being added or removed without interruptions
How we implemented decoding and encoding between raw and encoded formats
The user interface that was built to control the mixing in a user-friendly way (though "raw" control through JSON is also possible)