Show HN: Infini-News – 1.36B news articles from Common Crawl, queryable in ms
Category: infrastructure
Tags: news-articles, common-crawl, search-engine
Score: 6.7/10 (Innovation: 6, Technical: 7, Documentation: 3, Utility: 7)
Infini-News is a dataset of 1.36 billion news articles extracted from Common Crawl that can be queried in milliseconds. It uses a custom index for low-latency search, which makes it suited to large-scale information retrieval and news analysis.
Target audience: data engineers, researchers, backend devs
Repository: https://cs2.uni-graz.at/blog/infini-news/