An API for extracting data from invoices

A developer guide to space-ocr's API for extracting data from invoices: POST /ocr/fields with curl & Python, invoice templates, custom fields, and verified bounding boxes.

8 min read · 2026-06-25

Pulling structured data off an invoice — vendor, invoice number, date, line totals, tax — is one of the most common document-automation jobs there is, and one of the most tedious to build by hand. Regex over OCR text breaks the moment a vendor changes their layout. Template-matching tools want you to draw boxes for every supplier. What you actually want is an API for extracting data from invoices that reads any layout, returns clean typed fields, and — crucially — tells you where on the page each value came from so you can trust the result.

That last part is the whole game. An invoice extraction endpoint that hands back total: 2,045 with no provenance is a liability in an accounts-payable pipeline. This guide walks through space-ocr's POST /ocr/fields endpoint: a single synchronous call that takes one invoice image, applies a built-in invoice template (or your own field schema), and returns every value with a verified bounding box.

See the output before you write a line of code

Below is a real parsed receipt. Hover any field and the box on the image lights up — that box is exactly where the value was read from, and each field carries its own match ratio. Invoices behave the same way: every field you extract lands back on the pixels it came from.

Figure: source receipts with extracted-field bounding boxes, alongside a "Verified fields" panel listing KINSHO · 合計 (total) 2,045 and ライフ · 合計 (total) 4,286.

Every value carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.

Every extracted value ships with a bounding box, oriented vertices, and a match ratio — provenance you can render or cite.

Authentication and base URL

The public API lives at a single base — https://api.space-ocr.com — with no /v1 path versioning. Every request authenticates with an HTTP Bearer token whose key is prefixed spocr_:

Authorization: Bearer spocr_xxxxxxxxxxxxxxxx

A missing or malformed header returns 401; an unrecognized key returns 403. Every response carries an X-Request-Id header (format req_xxx) you should log for support traces. The full spec is published as OpenAPI 3.1 at GET /openapi.json if you'd rather generate a client.
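As a minimal sketch of the rules above, the helpers below build the Authorization header and turn a 401 or 403 into a log-friendly message. The function names (auth_headers, describe_auth_failure) are illustrative and not part of any space-ocr SDK; only the spocr_ prefix, the two status codes, and the X-Request-Id header come from this guide.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Bearer header; reject keys without the spocr_ prefix early."""
    if not api_key.startswith("spocr_"):
        raise ValueError("space-ocr keys are expected to start with 'spocr_'")
    return {"Authorization": f"Bearer {api_key}"}


def describe_auth_failure(status_code: int, response_headers: dict) -> str:
    """Explain a 401/403 and include X-Request-Id for support traces."""
    request_id = response_headers.get("X-Request-Id", "unknown")
    if status_code == 401:
        reason = "missing or malformed Authorization header"
    elif status_code == 403:
        reason = "unrecognized API key"
    else:
        reason = f"unexpected status {status_code}"
    return f"{reason} (request id: {request_id})"
```

Checking the prefix locally is a convenience that catches an obviously wrong key before a network round trip; the server remains the authority on whether a key is valid.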

The simplest call: a built-in invoice template

The fastest path is templateId: "invoice" — a predefined schema that knows what an invoice looks like, so you don't have to describe fields yourself. Pass the image as a URL or pure base64 (the imageType is auto-detected from an http(s):// prefix), and you get typed fields back.

POST /ocr/fields with the built-in invoice template
curl -X POST https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer spocr_xxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/invoices/inv-4471.jpg",
    "imageType": "url",
    "templateId": "invoice"
  }'
Why it matters

Camel-case is the canonical form. Parameters are imageType, templateId, autoFields. The legacy snake_case aliases (image_type, template_id, auto_fields) still work but are deprecated — prefer the camel-case names in new code.
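The request body from the curl call can also be built in code. The sketch below is one possible helper (build_fields_payload is a made-up name): it uses the camel-case parameter names and mirrors the http(s):// prefix rule described earlier to fill in imageType. Setting imageType explicitly on the client is a choice made here for readability, not something the guide requires.

```python
def build_fields_payload(image: str, template_id: str = "invoice") -> dict:
    """Build a camel-case request body for POST /ocr/fields."""
    # Same rule the guide describes: an http(s):// prefix means a URL,
    # anything else is treated as base64 content.
    is_url = image.startswith(("http://", "https://"))
    return {
        "image": image,
        "imageType": "url" if is_url else "base64",
        "templateId": template_id,
    }
```

The returned dict can be passed as the json= argument of an HTTP client call.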

The response shape

A successful call returns { status: "success", data: { ... } }. Each extracted value carries its own provenance, and a field_bboxes map gives coordinates per field:

  • bbox — an axis-aligned rectangle { xmin, ymin, xmax, ymax } on a 0–1000 normalized grid (0,0 = top-left, 1000,1000 = bottom-right), independent of the image's pixel size. Convert to pixels with pixel_x = bbox_x / 1000 × image_width.
  • vertices — four ordered points {x, y} (top-left → top-right → bottom-right → bottom-left) forming an oriented box that follows the document's tilt, so a skewed phone photo of an invoice still boxes cleanly.
  • match_ratio — the fraction of the value's characters actually located on the page (0–1). A field is treated as confidently matched at ≥ 0.85; 1.0 means every character was found.
  • bbox_source — how the coordinate was derived: token_id (the deterministic word-token lookup, match ratio 1.0), token_id_hybrid, vision_symbol_match, low_confidence, or shared_value.
POST /ocr/fields → response (abridged)
{
  "status": "success",
  "data": {
    "total": "2,045",
    "field_bboxes": {
      "total": {
        "bbox": { "xmin": 595, "ymin": 974, "xmax": 781, "ymax": 1000 },
        "vertices": [
          { "x": 594, "y": 975 }, { "x": 781, "y": 972 },
          { "x": 781, "y": 998 }, { "x": 595, "y": 1000 }
        ],
        "match_ratio": 1.0,
        "bbox_source": "token_id"
      }
    }
  }
}
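To draw a returned box on the original image, the normalized coordinates have to be scaled back to pixels. The sketch below applies the conversion given above for x and assumes the same rule holds for y with the image height, which the guide implies but does not spell out. The helper names and the 2000 × 3000 image size in the usage note are illustrative.

```python
def bbox_to_pixels(bbox: dict, image_width: int, image_height: int) -> tuple:
    """Scale a 0-1000 normalized bbox to integer pixel coordinates."""
    return (
        round(bbox["xmin"] * image_width / 1000),
        round(bbox["ymin"] * image_height / 1000),  # assumed: y scales by height
        round(bbox["xmax"] * image_width / 1000),
        round(bbox["ymax"] * image_height / 1000),
    )


def is_confident(field_bbox: dict, threshold: float = 0.85) -> bool:
    """Apply the 0.85 match_ratio threshold described in the guide."""
    return field_bbox["match_ratio"] >= threshold
```

For the "total" box in the response above, on a hypothetical 2000 × 3000 pixel image, bbox_to_pixels returns (1190, 2922, 1562, 3000).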
✓ Verified

The model never invents the coordinates. The language model returns each value plus the IDs of the word tokens it used — it does not return bounding boxes. The engine then looks those tokens up in the underlying vision OCR and unions their boxes. A model can hallucinate text; it cannot hallucinate a position the vision layer never detected. A value that isn't really on the invoice has no tokens to anchor to. See why bounding boxes make OCR auditable for the full reasoning.
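The lookup-and-union step can be pictured with a few lines of code. This is an illustration of the idea only, not the engine's implementation: the token IDs, the shape of the ocr_tokens mapping, and both function names are invented for the example.

```python
def union_boxes(boxes: list) -> dict:
    """Return the smallest axis-aligned box that contains every input box."""
    if not boxes:
        raise ValueError("cannot union an empty list of boxes")
    return {
        "xmin": min(b["xmin"] for b in boxes),
        "ymin": min(b["ymin"] for b in boxes),
        "xmax": max(b["xmax"] for b in boxes),
        "ymax": max(b["ymax"] for b in boxes),
    }


def locate_value(token_ids: list, ocr_tokens: dict):
    """Look token IDs up in the OCR layer; IDs it never detected give no box."""
    found = [ocr_tokens[t] for t in token_ids if t in ocr_tokens]
    return union_boxes(found) if found else None


# Hypothetical OCR layer: two word tokens that together form "2,045".
ocr_tokens = {
    "t17": {"xmin": 595, "ymin": 974, "xmax": 690, "ymax": 1000},
    "t18": {"xmin": 692, "ymin": 975, "xmax": 781, "ymax": 999},
}
```

With this data, locate_value(["t17", "t18"], ocr_tokens) yields one box spanning both tokens, while a token ID the OCR layer never produced yields None, which is the property the paragraph above relies on.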

Custom fields when the template isn't enough

Real invoices have fields a generic template won't name — a PO reference, a payment-terms code, a project tag. Pass a fields array of FieldSpec objects instead of (or alongside) a template. Each FieldSpec is { name, type, description?, children? }. If you send both fields and templateId, fields wins.

The description is where you steer the model: it's plain English instructions for what to capture and how. And type: "array" with children is how you pull repeating line items — one child schema, many rows. (We go deep on that in extracting line items from invoices.)

Custom FieldSpec with nested line items
import base64

import requests

# Read the invoice image and encode it as plain base64 text.
with open("invoice.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "https://api.space-ocr.com/ocr/fields",
    headers={"Authorization": "Bearer spocr_xxxxxxxxxxxxxxxx"},
    json={
        "image": b64,
        "imageType": "base64",
        "fields": [
            {"name": "vendor", "type": "string",
             "description": "Supplier / billing company name"},
            {"name": "invoice_no", "type": "string",
             "description": "Invoice number, verbatim"},
            {"name": "invoice_date", "type": "string"},
            {"name": "total", "type": "string",
             "description": "Grand total, keep comma separators"},
            # An array field with children describes one repeating row.
            {"name": "line_items", "type": "array",
             "description": "One row per line on the invoice",
             "children": [
                 {"name": "description", "type": "string"},
                 {"name": "qty", "type": "number"},
                 {"name": "unit_price", "type": "number"},
             ]},
        ],
    },
    timeout=60,
)
# Fail loudly on 401/403/429/5xx instead of hitting a KeyError below.
resp.raise_for_status()

data = resp.json()["data"]
# Print the verbatim total and the share of its characters found on the page.
print(data["total"], data["field_bboxes"]["total"]["match_ratio"])
Why it matters

Values are preserved verbatim. A total of 7,855 comes back as the string "7,855" — comma separators, decimals, and full-width characters intact. The engine only normalizes when a field's description explicitly asks it to. The ¥ you see in the web UI is decoration, not part of the value. The engine accepts raster images only — JPEG, PNG, GIF, BMP, TIFF, WebP — and auto-converts to RGB.
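Because values arrive verbatim, any numeric normalization happens on your side. A minimal sketch, assuming the comma is a thousands separator (which would be wrong for locales that use a comma as the decimal mark); parse_amount is a hypothetical helper name.

```python
import unicodedata
from decimal import Decimal


def parse_amount(raw: str) -> Decimal:
    """Convert a verbatim amount such as '7,855' into a Decimal."""
    # NFKC folds full-width digits and punctuation into their ASCII forms.
    text = unicodedata.normalize("NFKC", raw)
    # Assumption: commas are thousands separators, so they can be dropped.
    text = text.replace(",", "").strip()
    return Decimal(text)
```

Keeping the original string next to the parsed number preserves the audit trail, since the bounding box refers to the characters as they appear on the page.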

Going async: batch uploads, jobs, and webhooks

POST /ocr/fields is synchronous and perfect for a single invoice in a request/response loop. For a folder of invoices, post them to a sheet with POST /upload (multipart files, repeated). By default it returns immediately with a jobs array:

{ "path": "...", "jobs": [ { "uniqueKey": "...", "jobId": "...", "status": "pending" } ] }

You can then learn the outcome in one of two ways: poll GET /jobs/{jobId}, or register a webhook. Webhooks are limited to one URL per space and are HMAC-SHA256 signed via the X-Spaceocr-Signature header. The events you'll care about are upload.received, item.created, ocr.completed (with data.result carrying the extraction), and ocr.failed. Always verify the signature before trusting a payload.
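Signature verification might look like the sketch below. The guide states only the algorithm (HMAC-SHA256) and the header name, so two things here are assumptions: that the header carries a hex digest, and that the digest is computed over the raw request body with your webhook secret. Check the published OpenAPI spec for the exact scheme before relying on this.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Compare the received signature with one computed over the raw body."""
    # Assumed scheme: hex-encoded HMAC-SHA256 of the unparsed request bytes.
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected, signature_header.strip())
```

Verify against the bytes exactly as received; re-serializing the parsed JSON can change whitespace or key order and make a valid signature fail.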

Idempotency, request tracing, and rate limits

A few headers make a production pipeline safe to retry:

Header | Purpose
Idempotency-Key | On /upload and /create, a repeat with the same key replays the cached response for 24h (X-Idempotent-Replay: true), so retries are safe and never charged twice.
X-Request-Id | Returned on every response (req_xxx); log it for support.

Rate limits are 60 requests/min per key and 600 requests/min per uid, on a fixed 60-second window. Exceed them and you get 429 with error.code: "rate_limited". The wait time is in the JSON body as details.retryAfterSec. There is no Retry-After HTTP header, so back off on the body value.

429 response body
{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded",
    "requestId": "req_8fa2c1"
  },
  "details": { "retryAfterSec": 12 }
}
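A retry loop that honors the body value could be sketched as follows. To stay independent of any HTTP client, it takes a send callable that returns (status_code, parsed_body); that callable, the function name, and the one-second fallback when details.retryAfterSec is absent are all assumptions made for the example.

```python
import time


def call_with_backoff(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() and retry on 429, waiting details.retryAfterSec each time."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on any non-429 result, or when the attempts are used up.
        if status != 429 or attempt == max_attempts:
            return status, body
        # The wait time lives in the JSON body, not in a Retry-After header.
        wait = body.get("details", {}).get("retryAfterSec", 1)
        sleep(wait)
```

When the wrapped call is POST /upload or /create, reusing the same Idempotency-Key across attempts means a retry replays the cached response rather than creating a second charge.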

From extraction to a queryable sheet

Once invoices are extracted into a sheet, you don't re-run OCR to read them back. GET /view runs server-side queries over stored rows — where, sort, select, limit, offset — with no charge and no re-extraction. The bounding boxes ride along by default; add boxes=0 for a leaner payload. From there you can export to CSV (UTF-8 BOM, so Excel and CJK text open cleanly) — see turning scanned documents into CSV.

Drop an invoice and the typed fields fill in — the same data the API returns, in the UI.

Pricing

POST /ocr/fields costs ¥10 per call, and POST /upload is ¥10 × N images. There's no charge on failure — if OCR returns no result, it's refunded, and a 502 engine error or ocr.failed event refunds automatically. Read-only endpoints (GET /space, /view, /amount, /health) are free. The free tier is 100 scans/month with no credit card; Pro is $39/month; Business is contact-sales.

Search across extracted invoices and jump straight to the matching cell — and its source box.

How to extract data from an invoice with the API

What is the best API for extracting data from invoices?
A good invoice extraction API reads any layout (not just templates you pre-draw), returns clean typed fields, and gives you provenance for each value. space-ocr's POST /ocr/fields does this in one synchronous call: pass an invoice image with templateId 'invoice' or your own fields[] schema, and every value comes back with a bounding box, oriented vertices, and a match ratio so you can verify it against the source.
Can I extract invoice line items as well as header fields?
Yes. Use a FieldSpec with type 'array' and a children schema describing one row (e.g. description, qty, unit_price). The API returns one row per line item on the invoice, each with its own bounding box. Header fields like vendor, invoice number, date, and total are extracted in the same call.
Does the invoice extraction API accept PDFs?
The engine accepts raster images only — JPEG, PNG, GIF, BMP, TIFF, and WebP — and auto-converts to RGB. Pass the image as a URL or as pure base64 in the 'image' field, with imageType set to 'url' or 'base64'.
How are invoice extraction API errors and rate limits handled?
Rate limits are 60 requests/min per key and 600/min per uid on a fixed 60-second window. Exceeding them returns HTTP 429 with error.code 'rate_limited' and the wait time in the JSON body as details.retryAfterSec — there is no Retry-After header. Use an Idempotency-Key on /upload and /create so retries replay the cached response for 24 hours instead of re-charging.
How much does it cost to extract data from an invoice?
POST /ocr/fields costs ¥10 per call, and /upload is ¥10 per image. There is no charge on failure — if OCR returns no result it is refunded automatically. The free tier includes 100 scans a month with no credit card; Pro is $39/month and Business is contact-sales.

Extract your first invoice in one call

Free tier — 100 scans a month, no credit card. Every field comes back with its on-page location.
