Show HN: CLI for crawling documentation sites into Markdown with defuddle
Category: cli-tool
Tags: cli, markdown, crawler, documentation, rag
Score: 5.5/10 (Innovation: 4, Technical: 5, Documentation: 7, Utility: 6)
Docrawl is a Node.js CLI that crawls static documentation sites and converts each page to Markdown using defuddle, so the output can be fed into LLM contexts or RAG pipelines. It fetches pages without launching a browser, which keeps crawling lightweight and fast: a practical, incremental improvement over existing scraping tools. The project is early-stage, with clear use cases but limited scope and features.
Target audience: backend devs, data engineers, ai-ml engineers
Repository: https://github.com/artemnistuley/docrawl · TypeScript · MIT · 3 stars
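To illustrate what browserless crawling of a static docs site can look like, here is a minimal sketch. It is not taken from the Docrawl source: `normalizeLink`, `extractLinks`, `inScope`, and `crawl` are hypothetical names, the regex-based link extraction stands in for a real HTML parser, and the defuddle conversion step is left out entirely. The page fetcher is passed in as a parameter so the crawl logic stays independent of the network.

```typescript
// Illustrative sketch only; all names here are hypothetical, not Docrawl's API.

interface CrawledPage {
  url: string;
  html: string;
}

/** Resolve href against the page URL and drop the #fragment; null if unusable. */
function normalizeLink(href: string, pageUrl: string): string | null {
  try {
    const url = new URL(href, pageUrl);
    if (url.protocol !== "http:" && url.protocol !== "https:") return null;
    url.hash = "";
    return url.toString();
  } catch {
    return null;
  }
}

/** Collect unique absolute links from raw HTML (assumption: a regex is good enough for a sketch). */
function extractLinks(html: string, pageUrl: string): string[] {
  const links: string[] = [];
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const normalized = normalizeLink(match[1], pageUrl);
    if (normalized !== null && links.indexOf(normalized) === -1) {
      links.push(normalized);
    }
  }
  return links;
}

/** Keep the crawl on the same origin and under the starting path. */
function inScope(url: string, rootUrl: string): boolean {
  const candidate = new URL(url);
  const root = new URL(rootUrl);
  return (
    candidate.origin === root.origin &&
    candidate.pathname.startsWith(root.pathname)
  );
}

/** Breadth-first crawl using plain HTTP fetches, with no browser involved. */
async function crawl(
  rootUrl: string,
  fetchPage: (url: string) => Promise<string>,
  maxPages = 50
): Promise<CrawledPage[]> {
  const start = normalizeLink(rootUrl, rootUrl);
  if (start === null) return [];

  const queue: string[] = [start];
  const seen = new Set<string>([start]);
  const pages: CrawledPage[] = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const url = queue.shift() as string;
    const html = await fetchPage(url);
    pages.push({ url, html });

    for (const link of extractLinks(html, url)) {
      if (!seen.has(link) && inScope(link, start)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return pages;
}
```

On Node.js 18 or later, the fetcher could be as simple as `(url) => fetch(url).then((res) => res.text())`. Each returned `html` string would then go to a converter such as defuddle to produce Markdown. Pages that render their content with client-side JavaScript would come back largely empty under this approach, which is why it suits static documentation sites.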