AI OCR

AI OCR that you don't have to take on faith

space-ocr uses an LLM to structure your documents, then validates every value against the real OCR symbols on the page and scores each one with a match_ratio.

AI OCR sounds like the answer to messy documents: hand a receipt or an invoice to a model and get clean, structured fields back. The trouble is what happens when the model is wrong. A language model will return a confident, well-formatted value whether or not it actually read it off the page, and most tools hand you that value with no way to tell the difference.

space-ocr takes a stricter line. An LLM does the structuring, but it doesn't get the final word. The model returns each value plus the word-token ids it thinks it used; the engine then character-matches that value against the symbols Google Vision actually detected on the page, locates it with a box, and scores how well it matched. So the AI is part of the pipeline, not the judge of it. You can check every value it produced.

See the AI's output, checked

Hover any field below — the box on the receipt is where that value was actually found on the page, not where the model claimed it was. Every value, box, and match score here is read straight from a real parsed result, not a mockup.

[Interactive demo: receipts with extracted-field bounding boxes. Verified fields shown: KINSHO, 合計 (total) 2,045; ライフ, 合計 (total) 4,286.]

Each value with a box carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
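To draw a box on your own copy of the image, the normalized coordinates have to be scaled to its pixel size. A minimal sketch, assuming a bbox is a mapping with the xmin/ymin/xmax/ymax keys used on this page; the helper name to_pixels is hypothetical and not part of the API:

```python
from typing import Dict

GRID = 1000  # the normalized grid runs from 0 to 1000 on both axes


def to_pixels(bbox: Dict[str, float], width: int, height: int) -> Dict[str, int]:
    """Scale a 0-1000 normalized box to pixel coordinates of a width x height image."""
    return {
        "xmin": round(bbox["xmin"] * width / GRID),
        "ymin": round(bbox["ymin"] * height / GRID),
        "xmax": round(bbox["xmax"] * width / GRID),
        "ymax": round(bbox["ymax"] * height / GRID),
    }


# Example with made-up numbers: a box on a 1200 x 1600 pixel scan.
print(to_pixels({"xmin": 100, "ymin": 250, "xmax": 500, "ymax": 300}, 1200, 1600))
# -> {'xmin': 120, 'ymin': 400, 'xmax': 600, 'ymax': 480}
```

Because the grid is independent of resolution, the same result can be overlaid on a thumbnail or on the full-size scan by changing only width and height.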

AI output, validated against the page

The LLM returns each value and the word-token ids it used — never coordinates. CharMatcher runs first and matches that value, character by character, against the symbols Vision actually detected.

Every value located and scored

Each field comes back with a bounding box (xmin/ymin/xmax/ymax on a 0–1000 grid), four oriented vertices, and a match_ratio. 0.85 or higher is a confident match; 1.0 means every character was found.
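One way to act on the score is to split fields into "accept" and "review" piles. The sketch below assumes each field is a dict that may carry a match_ratio key; the surrounding structure and the triage function are illustrative assumptions, not the documented response shape:

```python
from typing import Dict, List, Tuple

CONFIDENT = 0.85  # threshold quoted above


def triage(fields: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Return (confident, needs_review) field names based on match_ratio."""
    confident: List[str] = []
    needs_review: List[str] = []
    for name, field in fields.items():
        ratio = field.get("match_ratio")
        # A field without a score (no verified box) is treated as needing review.
        if ratio is not None and ratio >= CONFIDENT:
            confident.append(name)
        else:
            needs_review.append(name)
    return confident, needs_review


# Made-up sample values for illustration only.
sample = {
    "total": {"value": "2,045", "match_ratio": 1.0},
    "store": {"value": "KINSHO", "match_ratio": 0.83},
    "note": {"value": "n/a"},
}
print(triage(sample))  # -> (['total'], ['store', 'note'])
```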

Templates or auto-fields, no schema

Apply a built-in templateId like receipt or invoice, define your own fields, or set autoFields and let the model propose the schema. No schema to write for common documents.

Audit trail: original vs edited

When you correct a cell, the edit is stored beside the original OCR value rather than overwriting it — so what the AI read and what a human changed both stay on record.
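The idea can be pictured as a cell that keeps two values instead of one. This is only an illustrative data structure under that assumption; the class and attribute names are hypothetical and say nothing about how space-ocr stores edits internally:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    original: str                 # what OCR read; never overwritten
    edited: Optional[str] = None  # what a human changed it to, if anything

    def correct(self, new_value: str) -> None:
        """Record a human correction beside the original value."""
        self.edited = new_value

    @property
    def current(self) -> str:
        """The value to display or export: the edit if one exists, else the original."""
        return self.edited if self.edited is not None else self.original


cell = Cell(original="2,046")
cell.correct("2,045")
print(cell.original, cell.current)  # -> 2,046 2,045
```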

Line items the model can't fake

Repeated-value columns are checked for column and row consistency instead of being trusted blindly, so a model that swaps two rows gets caught, not exported.
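As a rough illustration of what a row-consistency check can look like, the sketch below flags line items whose on-page position contradicts their row order. It is a simplified stand-in under stated assumptions (one vertical center per row, rows expected top to bottom), not the engine's actual column clustering:

```python
from typing import List


def out_of_order_rows(y_centers: List[float]) -> List[int]:
    """Return indices of rows that sit higher on the page than the row before them.

    y_centers[i] is the vertical center of the matched box for row i
    (0 = top of page), so a well-ordered column is non-decreasing.
    """
    return [i for i in range(1, len(y_centers)) if y_centers[i] < y_centers[i - 1]]


# Made-up positions: the third row was matched above the second one.
print(out_of_order_rows([100, 140, 120, 180]))  # -> [2]
```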

Languages on autopilot

Japanese, Korean, Chinese, and English in one engine, mixed scripts handled — no language hint to set. The model and the matcher both work across scripts.

How AI OCR works in space-ocr

Upload an image and an LLM reads the document into structured fields, returning each value with the word-token ids it used. Before the result reaches you, CharMatcher matches the characters of each value against the symbols Google Vision detected on the page, producing the box, the oriented vertices, and the match_ratio. If the model supplied token ids, the engine looks up the corresponding Vision word boxes and can switch that field's bbox_source to token_id. For repeated-value columns it relies on column clustering and row consistency instead, because a model's token hints can be wrong there.
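To make the matching step concrete, here is a toy character matcher. It is not CharMatcher's algorithm, which this page does not spell out; it assumes the score is the share of the value's non-space characters found, in order, among the detected symbols, and that the box is the union of the matched symbol boxes. All names are hypothetical:

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) on the 0-1000 grid


def match_value(value: str, symbols: List[str]) -> Tuple[float, List[int]]:
    """Greedily match value characters, in order, against detected symbols.

    Returns (ratio, indices of the matched symbols).
    """
    wanted = [c for c in value if not c.isspace()]
    if not wanted:
        return 0.0, []
    matched: List[int] = []
    pos = 0
    for ch in wanted:
        try:
            idx = symbols.index(ch, pos)
        except ValueError:
            continue  # character is not on the page, so it lowers the ratio
        matched.append(idx)
        pos = idx + 1
    return len(matched) / len(wanted), matched


def union_box(boxes: List[Box]) -> Optional[Box]:
    """Smallest axis-aligned box covering all matched symbol boxes."""
    if not boxes:
        return None
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


symbols = list("合計2,045")
print(match_value("2,045", symbols)[0])  # -> 1.0
print(match_value("2,046", symbols)[0])  # -> 0.8
```

In this toy version a value the model misread by one digit scores 0.8, which falls under the 0.85 line described on this page. A real matcher would also need to handle repeated characters and reading order more carefully than a greedy scan does.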

You don't have to write a schema. Pass a built-in templateId like receipt or invoice, define your own fields, or set autoFields and let the model suggest the structure. The web app rasterizes PDFs page by page first; the public API takes raster images directly (JPEG, PNG, GIF, BMP, TIFF, WebP).

Structure a document, with every value checked:

curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/receipt.jpg",
    "imageType": "url",
    "templateId": "receipt"
  }'
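The same request can be built from Python with only the standard library. This sketch mirrors the curl call above (same URL, headers, and JSON keys); the function names build_request and send are assumptions for illustration, and the shape of the response is not shown here:

```python
import json
import urllib.request

API_URL = "https://api.space-ocr.com/ocr/fields"  # endpoint from the curl example


def build_request(image_url: str, template_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = {"image": image_url, "imageType": "url", "templateId": template_id}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (needs a real key and network access):
#   import os
#   req = build_request("https://example.com/receipt.jpg", "receipt",
#                       os.environ["SPACE_OCR_API_KEY"])
#   result = send(req)
```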

How to run AI OCR you can verify

1. Send a document. Upload an image to /ocr/fields (url or base64). In the app you can drop a PDF and each page is rasterized first; the public API takes raster images.
2. Let the AI structure it. Pass a built-in templateId, define your own fields, or set autoFields so the model proposes the schema. The LLM returns each value plus the word-token ids it used.
3. Read the checked result. Each value comes back with its bbox, vertices, match_ratio, and bbox_source, plus a field_bboxes map locating every field: the AI's output validated against the page.
4. Verify the low scores. Click a cell to highlight the exact region it was read from. A match_ratio below 0.85 marks a value worth a second look; your edit is stored beside the original OCR value.
5. Export or query. Download CSV (UTF-8 BOM, line items unfolded) or query a stored sheet with GET /view using where, sort, and select, with no re-OCR and no extra charge.
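Because the CSV export starts with a UTF-8 byte order mark, it is worth decoding it with Python's utf-8-sig codec so the mark does not end up glued to the first column name. A minimal sketch; the column names in the sample are made up, and read_export is a hypothetical helper:

```python
import csv
import io
from typing import Dict, List


def read_export(raw: bytes) -> List[Dict[str, str]]:
    """Parse an exported CSV, dropping the leading UTF-8 BOM if present."""
    text = raw.decode("utf-8-sig")  # utf-8-sig strips the BOM; plain utf-8 would keep it
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Made-up sample standing in for a downloaded file.
raw = "\ufeffitem,amount\r\nTea,120\r\nRice,480\r\n".encode("utf-8")
rows = read_export(raw)
print(rows[0]["item"], rows[1]["amount"])  # -> Tea 480
```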

Simple, predictable pricing

Pay $0.05 per image (¥10 / ₩100), with a free tier of 100 scans a month and no credit card. Flat plans add monthly scans, more sheets, and storage.

Plan      Price     Scans / month   Sheets   Storage
Free      Free      100             3        1 GB
Starter   $19/mo    400             10       10 GB

The Free plan needs no credit card.

Use AI on your documents without trusting it blindly

Free tier — 100 scans a month, no credit card. Every value the model produces comes back located and scored.
