Invoice OCR

Invoice OCR that turns supplier bills into data you can trust

Stop retyping invoices. space-ocr reads the supplier, invoice number, dates, totals, and every line item, and returns each value with its on-page box and a match score.

Every invoice that lands in your inbox is a small data-entry tax. Someone opens the PDF, finds the supplier, the invoice number, the dates, the tax line, the total, then retypes it all into the accounting system — and copies the line items by hand if anyone needs them. It's slow, it's where the typos live, and a single fat-fingered total can hold up a payment run.

Invoice OCR is supposed to take that off your plate: read the bill, get the fields back. The problem with most tools is they hand you a number and ask you to trust it. space-ocr reads the invoice into structured rows and returns every value with the exact spot on the page it came from — a box you can see, plus a score for how well it matched. So before you approve a payment, you can check the figure instead of taking it on faith.

See a real invoice you can check

Hover any field below — the box on the invoice is where that value was read. The supplier, the issue date, the billing period, the due date, the billed amount, the running total, and each line item are all read straight from a real parsed result, not a mockup.

Each value with a box carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.

Every value located

Supplier, invoice number, issue and due dates, and every amount return with a bounding box (xmin/ymin/xmax/ymax on a 0–1000 grid), four oriented vertices, and a match_ratio — so a total traces back to the exact spot on the page.
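
To draw one of these boxes on your own page render, the 0–1000 coordinates need scaling to pixels. The sketch below assumes the box arrives as a mapping with the keys xmin, ymin, xmax, and ymax; the exact JSON shape, the function name, and the sample values are illustrative assumptions, not taken from the API reference.

```python
from typing import Dict, Tuple

GRID = 1000  # normalized grid: (0,0) top-left, (1000,1000) bottom-right


def bbox_to_pixels(bbox: Dict[str, float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a 0-1000 normalized box to (left, top, right, bottom) in pixels."""
    left = round(bbox["xmin"] / GRID * width)
    top = round(bbox["ymin"] / GRID * height)
    right = round(bbox["xmax"] / GRID * width)
    bottom = round(bbox["ymax"] / GRID * height)
    return left, top, right, bottom


# Hypothetical box for a total, on a page rendered at 1240 x 1754 px.
total_box = {"xmin": 700, "ymin": 850, "xmax": 950, "ymax": 880}
print(bbox_to_pixels(total_box, 1240, 1754))
```

Because the grid is normalized, the same box works at any render resolution; only the width and height passed in change.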

Line items, not just totals

Request line items as an array field whose children describe one row (description, quantity, unit price, amount). Each cell keeps its own box, so a wrapped or merged line is still traceable.

Built-in invoice template

Pass templateId 'invoice' and the common fields come pre-defined — no schema to write. Need a different layout? Override or add your own fields.

Tax & totals

Subtotal, tax lines, and the grand total come back as their own fields, kept verbatim with their thousands separators, each with a box and a score you can verify before posting.
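
Since amounts are kept verbatim, your own code has to turn them into numbers before doing arithmetic. A minimal sketch, assuming "," is the thousands separator and "." the decimal point (invoices written as 1.234,56 would need different handling); the helper names are hypothetical.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(raw: str) -> Optional[Decimal]:
    """Convert a verbatim amount such as '1,234.56' to Decimal, or None if it does not parse.

    Assumes ',' groups thousands and '.' marks decimals.
    """
    cleaned = raw.strip().lstrip("$€£¥₩").replace(",", "").replace(" ", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def totals_agree(subtotal: str, tax: str, total: str) -> bool:
    """Check that subtotal + tax equals the grand total, using exact decimal arithmetic."""
    parts = [parse_amount(v) for v in (subtotal, tax, total)]
    if any(p is None for p in parts):
        return False
    return parts[0] + parts[1] == parts[2]


print(totals_agree("1,200.00", "120.00", "1,320.00"))
```

Decimal is used instead of float so that a cross-check like this does not fail on binary rounding.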

Clean exports

CSV with a UTF-8 BOM (Excel- and CJK-safe, line items unfolded into sub-rows) and JSON over a REST API — drop straight into your spreadsheet or accounting import.
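
If you build a similar file yourself from the JSON result, the same two ideas apply: a BOM so Excel detects UTF-8, and one sub-row per line item. The sketch below is an illustration only; the column names and the invoice dictionary shape are assumptions and may not match the layout of the space-ocr export.

```python
import csv
import io
from typing import Dict

COLUMNS = ["supplier", "invoice_number", "total",
           "description", "quantity", "unit_price", "amount"]


def invoice_to_csv_bytes(invoice: Dict) -> bytes:
    """Write column names, one invoice-level row, then one sub-row per line item."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    # Parent row: invoice-level fields only; line-item columns stay blank.
    writer.writerow({k: invoice[k] for k in ("supplier", "invoice_number", "total")})
    # Sub-rows: one per line item; invoice-level columns stay blank.
    for item in invoice["line_items"]:
        writer.writerow(item)
    # "utf-8-sig" prepends the BOM bytes EF BB BF.
    return buffer.getvalue().encode("utf-8-sig")


sample = {
    "supplier": "株式会社サンプル",
    "invoice_number": "INV-001",
    "total": "1,234.00",
    "line_items": [
        {"description": "Widget", "quantity": "2", "unit_price": "500.00", "amount": "1,000.00"},
        {"description": "Shipping", "quantity": "1", "unit_price": "234.00", "amount": "234.00"},
    ],
}
csv_bytes = invoice_to_csv_bytes(sample)
```

Amounts that contain a comma are quoted automatically by the csv module, so the verbatim thousands separators survive the round trip.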

AP automation

Post invoices to /upload as async jobs and get a signed webhook when each one is read, so new supplier bills flow into a sheet without anyone watching the queue.
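
A signed webhook should be verified before its payload is trusted. This page does not say how the signature is computed, so the sketch below assumes a common scheme: a hex-encoded HMAC-SHA256 of the raw request body under a shared secret. Check the API docs for the actual algorithm and header name; the function name here is hypothetical.

```python
import hashlib
import hmac


def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of the raw body under secret.

    Assumed scheme; the real signing method may differ.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how much matched.
    return hmac.compare_digest(expected, signature_hex)


# Example with made-up values: sign a body, then verify it.
secret = b"example-secret"
body = b'{"status": "done"}'
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_webhook(secret, body, signature))
```

Verify against the raw bytes of the request, before any JSON parsing, since re-serializing the payload can change whitespace and invalidate the signature.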

How invoice OCR works in space-ocr

Drop an invoice into the app and it's read into a row: supplier, dates, amounts, and the line items as a sub-table you can sort, filter, and export. A PDF invoice is rendered to an image per page first, then read. If you're calling the API directly, send the page image (the public API takes raster images — JPEG, PNG, GIF, BMP, TIFF, WebP) and you get the same structured result back.

You don't have to describe an invoice from scratch. Pass the built-in templateId 'invoice', or define your own fields, including an array field whose children describe one line-item row.

Extract invoice fields from a page image:

curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/invoice-page-1.png",
    "imageType": "url",
    "templateId": "invoice"
  }'
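
The same call can be made from Python. The sketch below reuses the endpoint, headers, and the image, imageType, and templateId keys from the curl example above. The custom-field part is an assumption: the "fields" key and the "name" and "type" keys are hypothetical, with only the idea of an array field with children coming from this page.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

API_URL = "https://api.space-ocr.com/ocr/fields"  # endpoint from the curl example


def build_request(api_key: str, image_url: str,
                  fields: Optional[List[Dict[str, Any]]] = None) -> urllib.request.Request:
    """Build the POST request; use the built-in invoice template unless custom fields are given."""
    payload: Dict[str, Any] = {"image": image_url, "imageType": "url"}
    if fields is None:
        payload["templateId"] = "invoice"
    else:
        payload["fields"] = fields  # hypothetical key name
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical definition of a line-item array field with one child per column.
line_items_field = {
    "name": "line_items",
    "type": "array",
    "children": [{"name": n} for n in ("description", "quantity", "unit_price", "amount")],
}

request = build_request("YOUR_API_KEY", "https://example.com/invoice-page-1.png")
# To send it: urllib.request.urlopen(request) and json.load() the response.
```

Building the request separately from sending it keeps the payload easy to inspect and test without touching the network.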

How to OCR an invoice

1. Add the invoice
   In the app, drop the invoice (PDF or image); each page is rendered to an image and queued for OCR. For AP automation, post it to /upload and get a webhook when it's read.
2. Use the invoice template
   Pass the built-in templateId 'invoice' for supplier, invoice number, dates, and totals, or supply your own fields, including an array field with children for the line items.
3. Read the structured result
   Each value returns with its bbox, vertices, match_ratio, and bbox_source, plus a field_bboxes map locating every field on the invoice.
4. Verify before you post
   Click any amount to highlight the exact region it was read from; a match_ratio below 0.85 flags a value worth a second look. Edits are stored beside the original OCR value.
5. Export or query
   Download CSV (UTF-8 BOM, line items unfolded) for your accounting import, or query a stored sheet with GET /view using where, sort, and select, with no re-OCR and no extra charge.
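
The verification step can also be applied in code. The sketch below lists the fields whose match_ratio falls below the 0.85 threshold mentioned above, lowest score first. It assumes the result is a mapping from field name to an object carrying match_ratio; that shape, the sample values, and the choice to treat a missing score as 0 are illustrative assumptions.

```python
from typing import Any, Dict, List

REVIEW_THRESHOLD = 0.85  # threshold quoted in the steps above


def fields_needing_review(fields: Dict[str, Dict[str, Any]],
                          threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return names of fields scoring below the threshold, lowest score first.

    A missing or null match_ratio is treated as 0.0 so the field is always flagged.
    """
    flagged = []
    for name, value in fields.items():
        score = value.get("match_ratio")
        score = 0.0 if score is None else float(score)
        if score < threshold:
            flagged.append((score, name))
    return [name for _, name in sorted(flagged)]


# Hypothetical parsed result.
result = {
    "total": {"value": "1,234.00", "match_ratio": 0.97},
    "due_date": {"value": "2024-07-31", "match_ratio": 0.72},
    "supplier": {"value": "Acme Ltd", "match_ratio": 0.85},
    "tax": {"value": "112.18"},
}
print(fields_needing_review(result))
```

A value at exactly 0.85 is not flagged, matching the "below 0.85" wording; tighten the comparison if you would rather review borderline values as well.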

Simple, predictable pricing

Pay $0.05 per image (¥10 / ₩100), with a free tier of 100 scans a month and no credit card. Flat plans add monthly scans, more sheets, and storage.

Free

100 scans / month
3 sheets
1 GB storage

Free — no card

Starter

$19/mo

400 scans / month
10 sheets
10 GB storage

Start free

Turn your supplier invoices into checkable data

Free tier — 100 scans a month, no credit card. Every value comes back with its on-page location.
