Show HN: IgniteMS – batch text embeddings at 253K msg/s on 8x A100
Category: library
Tags: embeddings, tensorrt, rust, batch-inference, gpu
Score: 7.8/10 (Innovation: 7, Technical: 9, Documentation: 7, Utility: 8)
IgniteMS is a batch text embedding engine written in Rust that runs on native TensorRT and reports up to 3x higher throughput than Hugging Face TEI on the same hardware. Its notable engineering optimizations are bucketed batching and multi-GPU coordination within a single process. It targets large-scale workloads such as vector database reindexing and corpus processing.
Target audience: data engineers, ML engineers, backend developers
Repository: https://github.com/Artain-AI/ignite-ms · Rust · Apache-2.0
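The summary names bucketed batching without showing it. As a rough illustration of the general idea only (not IgniteMS's actual code or API), the sketch below groups inputs by padded length so that each batch wastes little compute on padding. The bucket bounds, the whitespace "tokenizer", and the function names are all assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical bucket upper bounds, in tokens. Not taken from IgniteMS.
const BUCKET_BOUNDS: [usize; 4] = [16, 32, 64, 128];

/// Stand-in tokenizer (assumption): counts whitespace-separated words.
fn token_len(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Returns the smallest bucket bound that fits `len`.
/// Longer inputs fall into the last bucket; a real engine would truncate them.
fn bucket_for(len: usize) -> usize {
    BUCKET_BOUNDS
        .iter()
        .copied()
        .find(|&bound| len <= bound)
        .unwrap_or(BUCKET_BOUNDS[BUCKET_BOUNDS.len() - 1])
}

/// Groups input indices into batches of at most `max_batch` items
/// (`max_batch` must be non-zero). Every item in a batch shares the
/// same padded length, returned as the first tuple element.
fn bucketed_batches(texts: &[&str], max_batch: usize) -> Vec<(usize, Vec<usize>)> {
    let mut buckets: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for (i, text) in texts.iter().enumerate() {
        buckets.entry(bucket_for(token_len(text))).or_default().push(i);
    }
    let mut batches = Vec::new();
    for (bound, indices) in buckets {
        for chunk in indices.chunks(max_batch) {
            batches.push((bound, chunk.to_vec()));
        }
    }
    batches
}

fn main() {
    let texts = [
        "hi there",
        "a b c d e f g h i j k l m n o p q r",
        "ok",
        "short msg here",
    ];
    for (pad_to, items) in bucketed_batches(&texts, 2) {
        println!("pad_to={pad_to} items={items:?}");
    }
}
```

Fixed bucket bounds also keep the set of input shapes small, which tends to suit engines compiled for specific shapes; whether IgniteMS works this way is not stated in the summary.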