"GPU clouds" for AI applications are a hot topic at the moment, but they often either end up being just big traditional HPC-style cluster deployments rather than actual cloud infrastructure, or are built in secrecy by hyperscalers.
In this talk, we'll explore what makes a "GPU cloud" an actual cloud, how its requirements differ from those of traditional cloud infrastructure, and, most importantly, how you can build your own using open source technology: from hardware selection (do you really need to buy the six-figure boxes?) through firmware (OpenBMC), networking (SONiC, VPP), storage (Ceph, SPDK), orchestration (K8s, but not the way you think), OS deployment (mkosi, UEFI HTTP netboot), virtualization (QEMU, vhost-user), and performance tuning (NUMA, RDMA), all the way to various managed services (load balancing, API gateways, Slurm, etc.).
In addition to the purely technical side, we'll also cover some of the non-technical challenges of running your own infrastructure and how to decide whether it is worth doing yourself.