Show HN: Diffulex – Unified serving engine for diffusion language models
Category: infrastructure
Tags: diffusion-language-models, serving-engine, deep-learning
Score: 7.3/10 (Innovation: 7, Technical: 8, Documentation: 7, Utility: 7)
Diffulex is a unified serving engine for diffusion language models (dLLMs). It decouples decoding strategies from the runtime infrastructure through a Block-Buffer-Request hierarchy, supports multiple state-of-the-art dLLM decoding strategies and models, and reports competitive throughput while keeping execution static-shape and therefore CUDA Graph-friendly.
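To make the idea of separating the decoding strategy from the runtime more concrete, here is a minimal pure-Python sketch. It is not Diffulex's API: every name in it (`Block`, `Request`, `step`, `fill_first_mask`, `fill_all_masks`, and the constants) is hypothetical, and the token id `7` stands in for a model prediction. The only point it illustrates is that the runtime can hand a fixed-length buffer to any strategy on every step, which is the property a CUDA Graph capture would rely on.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# All constants below are illustrative assumptions, not Diffulex settings.
MASK = -1          # placeholder id for a not-yet-decoded position
BLOCK_SIZE = 4     # tokens per block
BUFFER_BLOCKS = 2  # blocks processed together in one step


@dataclass
class Block:
    tokens: List[int] = field(default_factory=lambda: [MASK] * BLOCK_SIZE)

    def done(self) -> bool:
        return MASK not in self.tokens


@dataclass
class Request:
    prompt: List[int]
    blocks: List[Block] = field(default_factory=list)

    def buffer(self) -> List[Block]:
        """Return exactly BUFFER_BLOCKS unfinished blocks, allocating new ones as needed."""
        active = [b for b in self.blocks if not b.done()]
        while len(active) < BUFFER_BLOCKS:
            block = Block()
            self.blocks.append(block)
            active.append(block)
        return active[:BUFFER_BLOCKS]


# A strategy maps a fixed-length token buffer to a buffer of the same length.
Strategy = Callable[[List[int]], List[int]]


def fill_first_mask(flat: List[int]) -> List[int]:
    """Toy strategy: decode one position per step (7 is a dummy token id)."""
    out = list(flat)
    if MASK in out:
        out[out.index(MASK)] = 7
    return out


def fill_all_masks(flat: List[int]) -> List[int]:
    """Toy strategy: decode every masked position in one step."""
    return [7 if t == MASK else t for t in flat]


def step(req: Request, strategy: Strategy) -> int:
    """Runtime side: build the buffer, call the strategy, write results back.

    Returns the buffer length, which is the same on every call.
    """
    buf = req.buffer()
    flat = [t for b in buf for t in b.tokens]
    new = strategy(flat)
    for i, b in enumerate(buf):
        b.tokens = new[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
    return len(flat)
```

Swapping `fill_first_mask` for `fill_all_masks` changes how quickly blocks finish, but `step` itself and the buffer length it sees stay the same.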
Target audience: AI researchers, ML engineers, backend developers
Repository: https://github.com/SJTU-DENG-Lab/mbd-lms · Python · MIT · 13 stars