Sparse convolution is a key building block for 3D perception and generative models, yet existing libraries are limited in both performance and flexibility. FlexGEMM is a Triton-powered GEMM backend tailored for 3D sparse convolutions. It runs up to roughly 2× faster than prior libraries. In this post, I’ll walk through the What, Why, and How behind FlexGEMM.