Show HN: INT21 – Self-Improving PTX Kernel Factory
Category: infrastructure
Tags: gpu-kernels, cuda, ptx, ai-infrastructure, nvidia-blackwell
Score: 8.0/10 (Innovation: 8, Technical: 9, Documentation: 8, Utility: 7)
INT21's PTX Kernel Factory uses AI agent swarms to generate and optimize low-level NVIDIA GPU kernels written in PTX. One example is a Kimi Delta Attention kernel that outperforms CUTLASS FlashKDA by 1.4-1.5x on Blackwell GPUs. By combining self-improving AI with expert-level PTX programming, it could automate a rare and difficult skill in GPU software optimization.
Target audience: backend devs, devops, data engineers
Repository: https://int21.ai/insights/introducing-int21-and-ptx-kernel-factory/ · CUDA · MIT