OCR API

An OCR API that returns data you can verify

One REST call returns structured JSON where every field carries a bounding box and a match score. Bearer auth, built-in templates, async jobs, signed webhooks.

Most OCR APIs hand you a wall of text and a confidence number for the whole page. You still have to find the invoice total, parse it, and hope it landed in the right place. The OCR API in space-ocr does the structuring for you: one POST with an image and a template, and you get back typed fields as JSON.

The part that matters for production is what rides along with each value. Every field comes back with the exact box on the page it was read from, the four corners of that box, and a match score. So your pipeline doesn't have to trust a model's word — it can check each value against where it actually sits on the document.

A real response you can inspect

Hover any field below — the box on the invoice is where that value was read. This is a real parsed result: the billing name ソジュハンザン海物語様, the amount due ¥84,263, the total ¥46,752, each line item, all returned with their own box and a match score. Nothing here is mocked.

[Figure: invoice with extracted-field bounding boxes, shown alongside the list of verified fields]

Each value with a box carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
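Because the grid is normalized, drawing a box on your own copy of the image means scaling it back to pixels. A minimal sketch, assuming each bbox arrives as an object with xmin, ymin, xmax, and ymax keys (the exact JSON nesting is an assumption); bbox_to_pixels is a hypothetical helper name and the sample numbers are made up:

```python
def bbox_to_pixels(bbox, image_width, image_height):
    """Scale a 0-1000 normalized bbox to integer pixel coordinates."""
    return {
        "left": round(bbox["xmin"] * image_width / 1000),
        "top": round(bbox["ymin"] * image_height / 1000),
        "right": round(bbox["xmax"] * image_width / 1000),
        "bottom": round(bbox["ymax"] * image_height / 1000),
    }


# Hypothetical bbox for one field; the values are illustrative only.
field_bbox = {"xmin": 100, "ymin": 250, "xmax": 400, "ymax": 300}

# A 2480 x 3508 px scan (A4 at 300 dpi).
print(bbox_to_pixels(field_bbox, image_width=2480, image_height=3508))
# → {'left': 248, 'top': 877, 'right': 992, 'bottom': 1052}
```

The same scaling applies to each of the four vertices if you want to draw the tilted outline instead of the axis-aligned box.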

One call, JSON with boxes
POST /ocr/fields with one image and get typed fields back. Each value carries its bbox, so you skip the second pass of finding where things are.
bbox, vertices, match_ratio
Every field returns xmin/ymin/xmax/ymax on a 0–1000 grid, four oriented vertices that follow the page tilt, and a match_ratio you can threshold on.
Built-in templates
Pass a templateId — receipt, invoice, delivery, business_card, driver_license, and more — or send your own fields, including an array field for line items.
Async jobs + signed webhooks
POST /upload to queue images, get a job per file, and receive an HMAC-SHA256 signed webhook on completion — or poll GET /jobs/{jobId}.
CSV and JSON exports
JSON over REST, plus CSV with a UTF-8 BOM (Excel- and CJK-safe), in which the line items of a stored sheet unfold into sub-rows.
Languages on autopilot
Japanese, Korean, Chinese, and English in one engine — no language hint to set, mixed scripts and full-width characters handled.
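For the signed webhooks in the list above, the receiving side should recompute the HMAC before trusting the payload. A minimal sketch of that check, assuming the signature arrives in the X-Spaceocr-Signature header as a hex digest of the raw request body, keyed with a shared webhook secret. The encoding, the signed content, and the name verify_webhook are assumptions; confirm them against the docs before relying on this:

```python
import hashlib
import hmac


def verify_webhook(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if the header matches HMAC-SHA256(secret, raw_body).

    ASSUMPTION: the signature is a hex digest over the unmodified body bytes.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest runs in constant time, which avoids timing leaks.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))


# Illustrative values only: a made-up secret and payload.
secret = "whsec_example"
body = b'{"jobId": "job_123", "status": "done"}'
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

print(verify_webhook(body, signature, secret))         # True
print(verify_webhook(body + b" ", signature, secret))  # False: body was altered
```

Verify against the raw bytes you received, not a re-serialized JSON object, since re-serialization can change whitespace or key order.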

How the OCR API works in space-ocr

Authenticate with a Bearer token (your key is prefixed spocr_) against the base URL https://api.space-ocr.com. Send one raster image to POST /ocr/fields as a URL or base64. The public API takes images only (JPEG, PNG, GIF, BMP, TIFF, WebP), so for a PDF you send page images. Pass a built-in templateId or your own fields, and you get back { status: 'success', data: {...} } with a value, bbox, vertices, and match_ratio per field.

The coordinates aren't invented by the model. The LLM returns each value plus the word-token ids it used; a character matcher then aligns that value against the symbols Google Vision actually detected on the page and scores the coverage as the match_ratio. A score of 0.85 or higher is a confident match, and 1.0 means every character was located. Every response also carries an X-Request-Id header, and errors come back as { error: { code, message, requestId } }.
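To get a feel for what a coverage score means, here is a toy analogue. It is not the service's matcher, which isn't shown on this page; it only measures what fraction of a value's characters can be aligned, in order, with text found on the page, using Python's standard difflib. The name coverage_ratio and the sample strings are illustrative assumptions:

```python
from difflib import SequenceMatcher


def coverage_ratio(value: str, page_text: str) -> float:
    """Fraction of the value's characters that align, in order, with page_text."""
    if not value:
        return 0.0
    matcher = SequenceMatcher(None, value, page_text, autojunk=False)
    # The final matching block is a zero-length sentinel, so it adds nothing.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(value)


# Every character of the value is found on the page.
print(coverage_ratio("¥84,263", "Amount due ¥84,263"))  # 1.0

# One character was misread on the page, so coverage drops below 1.0.
print(coverage_ratio("¥84,263", "¥84,2G3"))
```

A score built this way falls as soon as a character in the returned value has no counterpart on the page, which is why a threshold on it works as a sanity check.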

extract fields from an image
curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/invoice.png",
    "imageType": "url",
    "templateId": "invoice"
  }'
the same call in Python
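import os

import requests

# Read the key from the environment so it never lands in source control.
api_key = os.environ["SPACE_OCR_API_KEY"]

resp = requests.post(
    "https://api.space-ocr.com/ocr/fields",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "image": "https://example.com/invoice.png",
        "imageType": "url",
        "templateId": "invoice",
    },
    timeout=60,
)
resp.raise_for_status()  # raise on any 4xx/5xx response

# Each entry in data maps a field name to its value, box, and match score.
for name, field in resp.json()["data"].items():
    print(name, field["value"], field["bbox"], field["match_ratio"])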
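If the image lives on disk rather than at a URL, the request body carries base64 instead. A minimal sketch of building that body. Two things here are assumptions not confirmed by this page: the imageType value "base64", and that "pure base64" means no data: URI prefix. build_base64_payload is a hypothetical helper name:

```python
import base64


def build_base64_payload(image_bytes: bytes, template_id: str) -> dict:
    """Build a JSON body for POST /ocr/fields from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "image": encoded,        # ASSUMPTION: no "data:image/png;base64," prefix
        "imageType": "base64",   # ASSUMPTION: only "url" is shown in the examples
        "templateId": template_id,
    }


# Stand-in bytes; in practice you would read them from a file opened with "rb".
payload = build_base64_payload(b"hello", "invoice")
print(payload["image"])  # aGVsbG8=
```

Pass the resulting dict as the json argument of the same requests.post call shown above.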

How to call the OCR API

  1. Get an API key
    Sign in and create a key — it is prefixed spocr_. Send it as Authorization: Bearer <key> on every request to https://api.space-ocr.com.
  2. Send an image
    POST /ocr/fields with image (a URL or pure base64) and imageType. For a PDF, send the page images — the API takes raster formats (JPEG, PNG, GIF, BMP, TIFF, WebP).
  3. Pick a template or fields
    Pass a built-in templateId like 'invoice' or 'receipt', or supply your own fields — including an array field with children for line-item tables.
  4. Read the structured result
    You get { status: 'success', data: {...} } where each value carries its bbox, vertices, match_ratio, and bbox_source. Threshold on match_ratio to flag anything below 0.85.
  5. Scale out and query
    Queue many images with POST /upload (job per file, signed webhooks or GET /jobs/{jobId}), then read a stored sheet with GET /view using where, sort, and select — no re-OCR, no extra charge.
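Step 4's threshold can be applied in a few lines. A minimal sketch, assuming data maps field names to objects that carry match_ratio; the field names and values are made up, fields_to_review is a hypothetical helper, and array fields are skipped because their exact shape isn't shown here:

```python
CONFIDENT_MATCH = 0.85  # scores at or above this count as a confident match


def fields_to_review(data: dict, threshold: float = CONFIDENT_MATCH) -> list:
    """Return the names of fields whose match_ratio is missing or below threshold."""
    flagged = []
    for name, field in data.items():
        if not isinstance(field, dict):
            continue  # e.g. array fields for line items; handle those separately
        ratio = field.get("match_ratio")
        if ratio is None or ratio < threshold:
            flagged.append(name)
    return flagged


# Illustrative data in the shape described above, not a real response.
sample = {
    "total": {"value": "¥46,752", "match_ratio": 1.0},
    "billing_name": {"value": "ソジュハンザン海物語様", "match_ratio": 0.72},
    "due_date": {"value": "2024-01-31"},  # no match_ratio, so it is flagged
}
print(fields_to_review(sample))  # ['billing_name', 'due_date']
```

Treating a missing score as "needs review" is a conservative choice: a value that could not be located on the page should not pass silently.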

Simple, predictable pricing

Pay $0.05 per image (¥10 / ₩100), with a free tier of 100 scans a month and no credit card. Reading a stored sheet back with GET /view doesn't re-OCR and isn't charged. Flat plans add monthly scans, more sheets, and storage.

Free: $0
  • 100 scans / month
  • 3 sheets
  • 1 GB storage
Starter: $19/mo
  • 400 scans / month
  • 10 sheets
  • 10 GB storage
Pro (most popular): $49/mo
  • 1,100 scans / month
  • Unlimited sheets
  • 100 GB storage

Frequently asked questions

How do I authenticate with the OCR API?
Send an HTTP Bearer token on every request: Authorization: Bearer <key>. Keys are prefixed spocr_. The base URL is https://api.space-ocr.com with no version path. A missing or invalid header returns 401, an unrecognized key returns 403, and every response carries an X-Request-Id header for support.

What does the OCR API return for each field?
A value, a bounding box (xmin/ymin/xmax/ymax on a 0–1000 normalized grid, not pixels), four oriented vertices that follow the document's tilt, a match_ratio, and a bbox_source. A match_ratio of 0.85 or higher is a confident match, and 1.0 means every character was located on the page.

Can the OCR API read a PDF?
The public API takes raster images (JPEG, PNG, GIF, BMP, TIFF, WebP), so for a PDF you send the page images. The web app accepts PDFs directly and renders each page to an image before OCR. The structured result is the same either way.

Does the OCR API handle large or batch jobs?
Yes. POST /upload accepts one or more images and returns a job per file with status 'pending'. Completion arrives as an HMAC-SHA256 signed webhook (X-Spaceocr-Signature), or you can poll GET /jobs/{jobId}. POST /ocr/fields stays synchronous for a single image.

Are there rate limits and error codes?
The limit is 60 requests per minute per key. Over it you get 429 with code 'rate_limited' and a wait time in the body at details.retryAfterSec (not a Retry-After header). All errors share the envelope { error: { code, message, requestId } } across 400, 401, 402, 404, 429, 500, and 502.

How much does the OCR API cost?
$0.05 per image (¥10 / ₩100 per call), with a free tier of 100 scans a month and no credit card. POST /ocr/fields and each image in POST /upload cost one scan; GET /space, /view, and /amount are free. Flat plans (Starter and Pro) add monthly scans, sheets, and storage; see the plans above.
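Since the rate-limit wait time comes in the response body rather than a Retry-After header, a client has to read it from the JSON. A minimal sketch of that lookup. Whether details sits inside the error object or at the top level of the body isn't stated above, so this checks both; that placement, the one-second default, and the name retry_delay are all assumptions:

```python
def retry_delay(status_code: int, body: dict, default: float = 1.0):
    """Return seconds to wait before retrying, or None if this is not a 429."""
    if status_code != 429:
        return None
    error = body.get("error") or {}
    # ASSUMPTION: details may be nested under error or sit beside it.
    details = error.get("details") or body.get("details") or {}
    return float(details.get("retryAfterSec", default))


# Illustrative 429 body in the documented error envelope; values are made up.
rate_limited = {
    "error": {
        "code": "rate_limited",
        "message": "Too many requests",
        "requestId": "req_example",
        "details": {"retryAfterSec": 12},
    }
}
print(retry_delay(429, rate_limited))  # 12.0
print(retry_delay(200, {}))            # None
```

In a request loop you would sleep for the returned number of seconds and retry, and log the requestId from the envelope so support can trace the call.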

Ship OCR that returns checkable data

Free tier — 100 scans a month, no credit card. Every field comes back with its box and a match score.
