Show HN: NeuroFlow - 55.8x video inference speedup for Vision Transformers (PyTorch)

Category: library

Tags: vision-transformer, video-inference, token-compression

Score: 7.3/10 (Innovation: 7, Technical: 8, Documentation: 8, Utility: 6)

NeuroFlow is a dynamic routing framework for Vision Transformers that accelerates video inference with EMA-gated temporal sequence compression: redundant background tokens are skipped before they reach the encoder. It reports up to 55.8x speedup on high-resolution video while retaining high embedding fidelity, and it offers both training-free and fine-tuned architectures. It pairs token sparsity with a dual-memory reconstruction approach and is aimed at real-time vision applications.
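
To make the gating idea concrete, here is a minimal PyTorch sketch of how a temporal gate might decide which patch tokens to send to the encoder. It assumes "EMA" means an exponential moving average kept per patch position, and that a patch counts as redundant when it stays close to that average. The class `EmaTokenGate`, its parameters `decay` and `threshold`, and the mean-absolute-difference test are all hypothetical illustrations, not NeuroFlow's actual API or method. The dual-memory reconstruction step is not covered.

```python
import torch


class EmaTokenGate:
    """Hypothetical helper (not NeuroFlow's API): per-patch EMA change detector."""

    def __init__(self, decay: float = 0.9, threshold: float = 0.1):
        self.decay = decay          # assumed EMA smoothing factor
        self.threshold = threshold  # assumed minimum change for a patch to count as active
        self.ema = None             # running average, shape (num_patches, dim)

    def select(self, patches: torch.Tensor) -> torch.Tensor:
        """Return a boolean mask of shape (num_patches,) marking patches to encode."""
        if self.ema is None:
            # First frame: there is no history yet, so every patch is kept.
            self.ema = patches.clone()
            return torch.ones(patches.shape[0], dtype=torch.bool)
        # Mean absolute deviation of each patch from its running average.
        change = (patches - self.ema).abs().mean(dim=1)
        keep = change > self.threshold
        # Update the running average for every patch, kept or skipped.
        self.ema = self.decay * self.ema + (1.0 - self.decay) * patches
        return keep


gate = EmaTokenGate()
frame0 = torch.zeros(4, 8)   # 4 patch tokens, 8 features each
frame1 = frame0.clone()
frame1[2] += 1.0             # only patch 2 changes between the two frames

mask0 = gate.select(frame0)
mask1 = gate.select(frame1)
active_tokens = frame1[mask1]  # only these rows would be passed to the encoder
```

In this toy run the first frame keeps all four patches and the second keeps only the one that changed, so the encoder would see a sequence a quarter of the original length. Any real speedup depends on how much of each frame is static and on how skipped tokens are reconstructed afterwards.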

Target audience: computer vision researchers, ML engineers, backend devs

Repository: https://github.com/ynnk-research/-NeuroFlow · Python · Apache-2.0 · 18 stars
