Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
Category: ai-ml
Tags: gpt-2, cuda, transformer, from-scratch, language-model
Score: 7.5/10 (Innovation: 7, Technical: 9, Documentation: 8, Utility: 6)
NanoEuler implements a GPT-2-scale language model from scratch in pure C/CUDA, with hand-written forward and backward passes, a byte-level BPE tokenizer, FlashAttention, and a full training pipeline from pretraining through supervised fine-tuning. It is most valuable as an educational resource for understanding modern transformer internals; the resulting model is too small for practical use.
Target audience: machine learning engineers, AI researchers, systems programmers
Repository: https://github.com/JustVugg/nanoeuler · Cuda · MIT · 7 stars