OctonData Document Processing API

Extract structured data from documents with high accuracy

Quick Start: Which Endpoint Should I Use?

Use Case	Endpoint	Output
Get document text/markdown	`POST /api/v1/parse`	Markdown + chunks
Extract form data (invoices, IDs, tax forms)	`POST /api/v1/extract`	Structured JSON fields
Process large files or batches	`POST /api/v1/extract/async`	Job ID (poll for results)
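
The table above can be read as a small decision rule. The sketch below is illustrative only: `Goal` and `choose_endpoint` are hypothetical names, and because this page does not define what counts as a "large" file, that judgment is left to the caller as a flag.

```python
from enum import Enum


class Goal(Enum):
    TEXT = "text"      # document text / Markdown
    FIELDS = "fields"  # structured form data (invoices, IDs, tax forms)


def choose_endpoint(goal: Goal, large_or_batch: bool = False) -> str:
    """Return the endpoint path suggested by the quick-start table."""
    # Large files and batches go to the async endpoint per the table.
    if large_or_batch:
        return "/api/v1/extract/async"
    if goal is Goal.TEXT:
        return "/api/v1/parse"
    return "/api/v1/extract"
```

For example, `choose_endpoint(Goal.FIELDS)` returns `/api/v1/extract`.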

Available Endpoints

POST /api/v1/parse

Convert documents to clean Markdown with structure. Best for full text extraction and search indexing. Returns content in chunks with page coordinates.
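
As a rough sketch of what a call to this endpoint might look like, the helper below builds (but does not send) a request using only the Python standard library. The request format is not described on this page, so the multipart encoding, the `file` field name, and the `https://api.example.com` base URL are all assumptions. Authentication is omitted because it is not documented here.

```python
import uuid
import urllib.request


def build_parse_request(base_url: str, filename: str, data: bytes,
                        field: str = "file") -> urllib.request.Request:
    """Build a multipart POST for /api/v1/parse (assumed upload format)."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # Assumption: the service reads the document from a form field named "file".
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/parse", data=body, method="POST"
    )
    req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return req
```

The request could then be sent with `urllib.request.urlopen(req)`. Check the interactive API docs for the actual upload format before relying on this.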

POST /api/v1/extract

Extract structured data from forms, invoices, IDs, and tax documents. Returns field-level data with confidence scores. Auto-detects document type.
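
Field-level confidence scores are typically used to decide which values can be accepted automatically and which need human review. The sketch below assumes a hypothetical response shape in which each field maps to a `value` and a `confidence` between 0 and 1. The real schema may differ, and the 0.9 threshold is an arbitrary illustration.

```python
from typing import Any


def split_by_confidence(fields: dict[str, dict[str, Any]],
                        threshold: float = 0.9):
    """Split extracted fields into auto-accepted and needs-review groups."""
    accepted: dict[str, Any] = {}
    review: dict[str, Any] = {}
    for name, field in fields.items():
        # A missing confidence is treated as 0.0, so the field goes to review.
        target = accepted if field.get("confidence", 0.0) >= threshold else review
        target[name] = field.get("value")
    return accepted, review
```

A usage example with made-up data:

```python
sample = {
    "invoice_number": {"value": "INV-001", "confidence": 0.98},
    "total": {"value": "120.00", "confidence": 0.62},
}
accepted, review = split_by_confidence(sample)
```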

POST /api/v1/extract/async

Background processing for large files. Submit a job, then either poll for results or receive a webhook notification on completion.
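
The polling side of this workflow might look like the sketch below. The status endpoint and its response schema are not described on this page, so `fetch_status` is a caller-supplied function, and the `status` key with the values `completed` and `failed` are assumptions.

```python
import time
from typing import Any, Callable


def poll_job(fetch_status: Callable[[str], dict[str, Any]], job_id: str,
             interval: float = 2.0, timeout: float = 300.0,
             sleep: Callable[[float], None] = time.sleep) -> dict[str, Any]:
    """Call fetch_status(job_id) until the job finishes or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        # Assumption: terminal states are reported as "completed" or "failed".
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        sleep(interval)
```

Passing `fetch_status` and `sleep` as parameters keeps the loop independent of any HTTP client and makes it easy to test. For long-running jobs, a webhook avoids repeated requests altogether.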

POST /api/v1/upload/request

Get a signed URL for uploading files larger than 100 MB. Returns a document_id to use for processing.
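
A client might first check the file size and, only when needed, ask for a signed URL. The sketch below builds that request without sending it. The JSON body with `filename` and `size` keys is a guess at the request format, and the threshold reads "100MB" as 100 MiB; neither is confirmed by this page.

```python
import json
import urllib.request

# Assumption: "100MB" is interpreted as 100 MiB.
SIGNED_UPLOAD_THRESHOLD = 100 * 1024 * 1024


def needs_signed_upload(size_bytes: int) -> bool:
    """True when the file is larger than the signed-upload threshold."""
    return size_bytes > SIGNED_UPLOAD_THRESHOLD


def build_upload_request(base_url: str, filename: str,
                         size_bytes: int) -> urllib.request.Request:
    """Build a POST for /api/v1/upload/request (hypothetical JSON body)."""
    payload = json.dumps({"filename": filename, "size": size_bytes}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/upload/request",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The response is expected to carry the signed URL and the document_id mentioned above. The exact field names should be taken from the reference documentation.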

GET /health

Health check endpoint. Returns service status and component health.
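
A monitoring script could call this endpoint and flag any component that is not healthy. In the sketch below, the `components` key and the `"healthy"` state string are assumptions about the response schema, which this page does not specify.

```python
import json
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET /health and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def unhealthy_components(payload: dict) -> list[str]:
    """List component names whose state is not "healthy" (assumed schema)."""
    components = payload.get("components", {})
    return sorted(name for name, state in components.items() if state != "healthy")
```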

Documentation & Resources

Interactive API Docs: Swagger UI with try-it-now feature
Reference Documentation: Clean API reference with schemas
Health Check: Service status and diagnostics