Show HN: Open-source general-purpose alternative to Exa Websets
Category: devtools
Tags: web-scraping, ai-agents, data-extraction
Score: 6.8/10 (Innovation: 6, Technical: 7, Documentation: 7, Utility: 7)
BigSet is an open-source tool that turns a plain-English description of a dataset into structured data: AI agents autonomously research, extract, verify, and compile records from the live web, with scheduled refreshes. It combines AI-driven schema inference, parallel web agents, and automated deduplication, and exports the result as a downloadable CSV or XLSX file, so no manual scraping is required.
Target audience: backend devs, data engineers, AI researchers
Repository: https://github.com/tinyfish-io/bigset · TypeScript · AGPL-3.0 · 80 stars
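To make the deduplication and CSV export steps in the description concrete, here is a minimal sketch in TypeScript. It is not taken from the BigSet codebase: the `Row` type and the `normalize`, `dedupe`, `csvField`, and `toCsv` helpers are hypothetical names chosen for illustration, and exact-match deduplication on normalized key columns is an assumed strategy that may be simpler than what the project actually does.

```typescript
// Hypothetical row shape: column name -> cell value. Not BigSet's actual types.
type Row = Record<string, string>;

// Normalize a cell so trivially different spellings compare equal
// (surrounding whitespace, letter case, repeated spaces).
function normalize(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, " ");
}

// Keep the first row seen for each normalized key built from `keyColumns`.
function dedupe(rows: Row[], keyColumns: string[]): Row[] {
  const seen = new Set<string>();
  const unique: Row[] = [];
  for (const row of rows) {
    const key = keyColumns.map((c) => normalize(row[c] ?? "")).join("\u0000");
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(row);
    }
  }
  return unique;
}

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded quotes (RFC 4180 style).
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize rows to CSV text with a header line and CRLF line endings.
function toCsv(rows: Row[], columns: string[]): string {
  const lines = [columns.map(csvField).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c] ?? "")).join(","));
  }
  return lines.join("\r\n");
}

// Usage: two agents report the same company with different spacing and case.
const rows: Row[] = [
  { company: "Acme, Inc.", city: "Berlin" },
  { company: "  acme, inc. ", city: "Berlin" },
  { company: "Globex", city: "Paris" },
];
const csv = toCsv(dedupe(rows, ["company"]), ["company", "city"]);
console.log(csv);
```

Keeping the first occurrence preserves the original spelling of each record. A real pipeline would likely need fuzzier matching, since results gathered by parallel agents rarely differ only in whitespace and case.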