
Invoice OCR that turns supplier bills into data you can trust

Stop retyping invoices. space-ocr reads the supplier, invoice number, dates, totals, and every line item, and returns each value with its on-page box and a match score.

Every invoice that lands in your inbox is a small data-entry tax. Someone opens the PDF, finds the supplier, the invoice number, the dates, the tax line, the total, then retypes it all into the accounting system — and copies the line items by hand if anyone needs them. It's slow, it's where the typos live, and a single fat-fingered total can hold up a payment run.

Invoice OCR is supposed to take that off your plate: read the bill, get the fields back. The problem with most tools is they hand you a number and ask you to trust it. space-ocr reads the invoice into structured rows and returns every value with the exact spot on the page it came from — a box you can see, plus a score for how well it matched. So before you approve a payment, you can check the figure instead of taking it on faith.

See a real invoice you can check

Hover any field below — the box on the invoice is where that value was read. The supplier, the issue date, the billing period, the due date, the billed amount, the running total, and each line item are all read straight from a real parsed result, not a mockup.

Invoice with extracted-field bounding boxes

Each value with a box carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
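
To draw a box on your own copy of the page, the 0–1000 grid has to be scaled to the image's pixel size. Here is a minimal sketch of that conversion. It assumes the bbox arrives as a mapping with xmin/ymin/xmax/ymax keys; the PixelBox type, the to_pixels helper, and the sample numbers are illustrative, not part of the API.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def to_pixels(bbox: dict, width: int, height: int) -> PixelBox:
    """Scale a 0-1000 normalized box to pixels for a width x height image."""
    return PixelBox(
        left=round(bbox["xmin"] * width / 1000),
        top=round(bbox["ymin"] * height / 1000),
        right=round(bbox["xmax"] * width / 1000),
        bottom=round(bbox["ymax"] * height / 1000),
    )


# Illustrative values only, not taken from a real response.
total_box = {"xmin": 700, "ymin": 850, "xmax": 900, "ymax": 880}
print(to_pixels(total_box, width=1240, height=1754))
```

Because the grid is normalized, the same box works at any render resolution: pass the width and height of whichever image you are drawing on.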

Every value located
Supplier, invoice number, issue and due dates, and every amount return with a bounding box (xmin/ymin/xmax/ymax on a 0–1000 grid), four oriented vertices, and a match_ratio — so a total traces back to the exact spot on the page.
Line items, not just totals
Request line items as an array field whose children describe one row (description, quantity, unit price, amount). Each cell keeps its own box, so a wrapped or merged line is still traceable.
Built-in invoice template
Pass templateId 'invoice' and the common fields come pre-defined — no schema to write. Need a different layout? Override or add your own fields.
Tax & totals
Subtotal, tax lines, and the grand total come back as their own fields, kept verbatim with their thousands separators, each with a box and a score you can verify before posting.
Clean exports
CSV with a UTF-8 BOM (Excel- and CJK-safe, line items unfolded into sub-rows) and JSON over a REST API — drop straight into your spreadsheet or accounting import.
AP automation
Post invoices to /upload as async jobs and get a signed webhook when each one is read, so new supplier bills flow into a sheet without anyone watching the queue.
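
If you build your own export from the JSON, the CSV shape described under "Clean exports" can be approximated as below. This is a sketch under assumptions: the invoice dictionary layout and column names are hypothetical, and "unfolded into sub-rows" is interpreted as one CSV row per line item with the invoice-level fields on the first row only. The app's actual column layout may differ.

```python
import csv
import io

# Hypothetical column names chosen for illustration.
PARENT_COLUMNS = ["supplier", "invoice_number", "total"]
ITEM_COLUMNS = ["description", "quantity", "unit_price", "amount"]


def invoice_to_csv(invoice: dict) -> bytes:
    """Serialize one invoice to BOM-prefixed CSV, one sub-row per line item."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(PARENT_COLUMNS + ITEM_COLUMNS)
    parent = [invoice.get(key, "") for key in PARENT_COLUMNS]
    # An invoice without line items still gets one row.
    items = invoice.get("line_items") or [{}]
    for index, item in enumerate(items):
        # Invoice-level fields appear on the first sub-row only.
        lead = parent if index == 0 else [""] * len(parent)
        writer.writerow(lead + [item.get(key, "") for key in ITEM_COLUMNS])
    # "utf-8-sig" prepends the UTF-8 byte order mark (EF BB BF).
    return buffer.getvalue().encode("utf-8-sig")


# Made-up sample data; amounts keep their thousands separators as text.
sample = {
    "supplier": "株式会社サンプル",
    "invoice_number": "INV-001",
    "total": "3,300",
    "line_items": [
        {"description": "Paper", "quantity": "2", "unit_price": "1,000", "amount": "2,000"},
        {"description": "Toner", "quantity": "1", "unit_price": "1,300", "amount": "1,300"},
    ],
}
csv_bytes = invoice_to_csv(sample)
```

The csv module quotes any value containing a comma, so an amount like 3,300 stays in one cell, and the byte order mark is what lets Excel detect UTF-8 and show the CJK supplier name correctly.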

How invoice OCR works in space-ocr

Drop an invoice into the app and it's read into a row: supplier, dates, amounts, and the line items as a sub-table you can sort, filter, and export. A PDF invoice is rendered to an image per page first, then read. If you're calling the API directly, send the page image (the public API takes raster images — JPEG, PNG, GIF, BMP, TIFF, WebP) and you get the same structured result back.

You don't have to describe an invoice from scratch. Pass the built-in templateId 'invoice', or define your own fields, including an array field whose children describe one line-item row.
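
As a rough illustration of the custom-field route, a request body with an array field for line items might be assembled like this. Only the 'array' type and the idea of children come from the description above. The key names fields, name, type, and children, and the "string" type, are assumptions for the sketch, so check the API docs for the real schema.

```python
import json

# Hypothetical schema: key names and the "string" type are assumptions.
line_items_field = {
    "name": "line_items",
    "type": "array",
    # The children describe the cells of one line-item row.
    "children": [
        {"name": "description", "type": "string"},
        {"name": "quantity", "type": "string"},
        {"name": "unit_price", "type": "string"},
        {"name": "amount", "type": "string"},
    ],
}

payload = {
    "image": "https://example.com/invoice-page-1.png",
    "imageType": "url",
    "fields": [
        {"name": "supplier", "type": "string"},
        {"name": "total", "type": "string"},
        line_items_field,
    ],
}

print(json.dumps(payload, indent=2))
```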

Extract invoice fields from a page image:
curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/invoice-page-1.png",
    "imageType": "url",
    "templateId": "invoice"
  }'
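
The same call can be made from Python using only the standard library. This sketch mirrors the curl request above (same URL, headers, and body). The helper names build_request and extract_invoice are illustrative, and treating the response as a JSON object is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.space-ocr.com/ocr/fields"


def build_request(image_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request with the built-in invoice template."""
    body = json.dumps(
        {"image": image_url, "imageType": "url", "templateId": "invoice"}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_invoice(image_url: str, api_key: str) -> dict:
    """Send the request and parse the response (assumed to be JSON)."""
    request = build_request(image_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# import os
# result = extract_invoice(
#     "https://example.com/invoice-page-1.png",
#     os.environ["SPACE_OCR_API_KEY"],
# )
```

Keeping request construction separate from sending makes it easy to inspect or log exactly what will be posted before spending a scan.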

How to OCR an invoice

  1. Add the invoice
    In the app, drop the invoice (PDF or image) — each page is rendered to an image and queued for OCR. For AP automation, post it to /upload and get a webhook when it's read.
  2. Use the invoice template
    Pass the built-in templateId 'invoice' for supplier, numbers, dates, and totals, or supply your own fields — including an array field with children for the line items.
  3. Read the structured result
    Each value returns with its bbox, vertices, match_ratio, and bbox_source, plus a field_bboxes map locating every field on the invoice.
  4. Verify before you post
    Click any amount to highlight the exact region it was read from; a match_ratio below 0.85 flags a value worth a second look. Edits are stored beside the original OCR value.
  5. Export or query
    Download CSV (UTF-8 BOM, line items unfolded) for your accounting import, or query a stored sheet with GET /view using where, sort, and select — no re-OCR, no extra charge.
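
Step 4 can also be automated for unattended runs. The sketch below flags any field whose match_ratio falls below the 0.85 threshold mentioned above. It assumes the result can be viewed as a mapping from field name to an object carrying value and match_ratio; that shape, the needs_review helper, and the sample values are illustrative.

```python
REVIEW_THRESHOLD = 0.85


def needs_review(fields: dict, threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """Return the names of fields whose match_ratio is missing or below threshold."""
    flagged = []
    for name, field in fields.items():
        ratio = field.get("match_ratio")
        if ratio is None or ratio < threshold:
            flagged.append(name)
    return flagged


# Made-up sample values, not a real parsed result.
parsed = {
    "supplier": {"value": "Acme Ltd", "match_ratio": 1.0},
    "due_date": {"value": "2024-05-31", "match_ratio": 0.85},
    "total": {"value": "3,300", "match_ratio": 0.78},
}
print(needs_review(parsed))
```

A value at exactly 0.85 passes, since 0.85 or higher counts as a confident match. A field with no match_ratio at all is flagged, on the cautious assumption that an unlocated value deserves a look before payment.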

Simple, predictable pricing

Pay $0.05 per image (¥10 / ₩100), with a free tier of 100 scans a month and no credit card. Flat plans add monthly scans, more sheets, and storage.

Free
$0
  • 100 scans / month
  • 3 sheets
  • 1 GB storage
Starter
$19/mo
  • 400 scans / month
  • 10 sheets
  • 10 GB storage
Pro
$49/mo
  • 1,100 scans / month
  • Unlimited sheets
  • 100 GB storage
What does invoice OCR pull off an invoice?
The supplier name, invoice number, issue and due dates, billing period, subtotal, tax, and grand total as their own fields — plus the line items as repeating rows (description, quantity, unit price, amount). Every value comes back with its on-page box and a match score.
Can it read the line items, not just the total?
Yes. Request line items as a field of type 'array' whose children describe one row. Each cell keeps its own bounding box, so a wrapped or merged line item is still traceable to its spot on the page, and it unfolds into sub-rows on export.
How do I know the total it read is right?
Every value returns with a bounding box (xmin/ymin/xmax/ymax on a 0–1000 grid), four oriented vertices, and a match_ratio. The output is validated against the real OCR symbols on the page; 0.85 or higher is a confident match, and 1.0 means every character was located. Click a cell to highlight the exact region it was read from.
Can I export invoices to CSV or feed them into accounting?
Yes. Download CSV with a UTF-8 BOM so Excel opens Japanese, Korean, and Chinese text correctly, with line items unfolded into sub-rows, or take JSON over the REST API. Push invoices to /upload as async jobs and a signed webhook fires when each is read.
Does it handle PDF invoices?
The web app accepts PDF invoices directly — it renders each page to an image and runs OCR. The public API takes raster images (JPEG, PNG, GIF, BMP, TIFF, WebP), so when calling the API you send the page image.
How much does invoice OCR cost?
$0.05 per image (¥10 / ₩100 per scan), with a free tier of 100 scans a month and no credit card. Flat plans (Starter and Pro) add monthly scans, more sheets, and storage — see the plans above.

Turn your supplier invoices into checkable data

Free tier — 100 scans a month, no credit card. Every value comes back with its on-page location.
