Show HN: A book that builds GPT-2, Llama 3, DeepSeek from scratch in PyTorch
Category: ai-ml
Tags: llm, pytorch, educational, transformer, deep-learning
Score: 7.0/10 (Innovation: 6, Technical: 8, Documentation: 7, Utility: 7)
A book and companion code repository that teaches how to build five modern LLM architectures, including Transformer, GPT-2, Llama 3.2, and DeepSeek, from scratch in PyTorch, with runnable implementations that can load real pretrained weights. It goes beyond typical educational resources by covering architectures used in production, such as Llama 3.2 and DeepSeek, along with their advanced optimizations.
Target audience: data engineers, ML engineers, researchers, senior developers
Repository: https://github.com/S1LV3RJ1NX/mal-code · Python · Apache-2.0