Show HN: GPT-2 inference in pure C#, 0 bytes allocated per token
Category: library
Tags: deep-learning, inference-engine, csharp, gpt-2, zero-allocation, onnx
Score: 7.8/10 (Innovation: 8, Technical: 9, Documentation: 7, Utility: 7)
Overfit is a deep learning and inference engine written in pure C#. It runs zero-allocation, CPU-only inference for models such as GPT-2, with performance described as competitive with ONNX Runtime. It combines evolutionary optimization, ONNX import, and explicit memory ownership, which makes it interesting for .NET teams that want to avoid native dependencies and a Python runtime.
Target audience: backend devs
Repository: https://github.com/DevOnBike/Overfit · C# · License: NOASSERTION · 7 stars