Show HN: Extract (YC P25) – Fast, accurate document parsing
Category: infrastructure
Tags: document-parsing, ocr, api, healthcare, data-extraction
Score: 6.7/10 (Innovation: 5, Technical: 7, Documentation: 7, Utility: 8)
Extract is a commercial API that parses documents (PDF, DOCX, PPTX) into structured data: text, tables, figures, and per-span bounding boxes and OCR confidence scores. It stands out for its healthcare compliance (HIPAA/BAA) and for its claim to outperform AWS Textract and LlamaParse on both accuracy and latency.
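To make the per-span output concrete, here is a minimal Python sketch of how a client might flag low-confidence OCR spans for manual review. The payload shape (`spans`, `text`, `bbox`, `confidence`), the 0 to 1 confidence scale, and the helper names are assumptions made up for illustration. They are not taken from Extract's documentation, and no real API call is made.

```python
from dataclasses import dataclass


@dataclass
class Span:
    text: str
    bbox: tuple[float, ...]  # assumed (x0, y0, x1, y1) page coordinates
    confidence: float        # assumed OCR confidence on a 0..1 scale


def parse_spans(payload: dict) -> list[Span]:
    """Convert a hypothetical response payload into Span objects."""
    return [
        Span(
            text=s["text"],
            bbox=tuple(s["bbox"]),
            confidence=float(s["confidence"]),
        )
        for s in payload.get("spans", [])
    ]


def needs_review(spans: list[Span], threshold: float = 0.9) -> list[Span]:
    """Return the spans whose OCR confidence falls below the threshold."""
    return [s for s in spans if s.confidence < threshold]


# Hand-written sample data standing in for a parsed document (not real output).
sample = {
    "spans": [
        {"text": "Patient: Jane Doe", "bbox": [72, 90, 310, 108], "confidence": 0.99},
        {"text": "Dosage: 5 rng", "bbox": [72, 120, 250, 138], "confidence": 0.62},
    ]
}

flagged = needs_review(parse_spans(sample))
for span in flagged:
    print(f"{span.confidence:.2f} {span.bbox} {span.text!r}")
```

Keeping the bounding box next to each flagged span lets a reviewer jump straight to the region of the page that the OCR was unsure about, which matters most for the healthcare documents the listing mentions.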
Target audience: backend devs, data engineers
Website: https://extract.page