Show HN: I built a 2nd-order PyTorch optimizer for LLMs that runs on 16GB GPUs
Category: library
Tags: pytorch, optimizer, llm, second-order, deep-learning, training
Score: 7.5/10 (Innovation: 8, Technical: 8, Documentation: 7, Utility: 7)
SCAO is a second-order PyTorch optimizer that approximates Shampoo-style curvature information with low-rank Kronecker factors, aiming for memory usage and throughput close to AdamW. It is notable for targeting LLM training on consumer GPUs (16GB) and for features such as adaptive rank selection, int8 quantization, and fused CUDA kernels.
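To make the memory argument concrete, here is a rough back-of-the-envelope sketch in Python. It counts optimizer-state elements for a single weight matrix under three schemes: AdamW (two moment buffers), full Shampoo statistics (one left and one right Kronecker factor), and a hypothetical low-rank storage of those factors. The function names, the rank of 64, and the "basis plus eigenvalues" layout are assumptions chosen for illustration; they are not taken from the SCAO code, which may store its factors differently.

```python
def adamw_state(m: int, n: int) -> int:
    """Elements kept by AdamW for an m x n weight: first and second moments."""
    return 2 * m * n


def shampoo_state(m: int, n: int) -> int:
    """Elements in full Shampoo statistics: a left (m x m) and a right (n x n) factor."""
    return m * m + n * n


def low_rank_kronecker_state(m: int, n: int, rank: int) -> int:
    """Assumed layout: each factor kept as a (dim x rank) basis plus `rank` eigenvalues."""
    left = m * rank + rank
    right = n * rank + rank
    return left + right


if __name__ == "__main__":
    # Illustrative shape only: a 4096 x 11008 projection, typical of a 7B-class MLP layer.
    m, n, rank = 4096, 11008, 64
    print("AdamW moments:        ", adamw_state(m, n))
    print("Full Shampoo factors: ", shampoo_state(m, n))
    print("Low-rank factors:     ", low_rank_kronecker_state(m, n, rank))
```

Under these assumptions the low-rank factors add roughly one million elements for this layer, against about 138 million for the full factors, which is the kind of gap that would let a curvature-aware optimizer stay near AdamW's footprint. Real numbers depend on the rank chosen per layer and on any quantization applied to the factors.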
Target audience: backend developers, AI/ML engineers, data scientists
Repository: https://github.com/whispering3/scao · Python · NOASSERTION · 6 stars