Extract line items from invoices automatically
Pull invoice and receipt line items into structured rows automatically. Declare an array field, get one verifiable row per item — each with a bounding box — and export to CSV.
Invoices and receipts are the documents people most want to digitize, and the hardest part is never the header. Vendor name, date, invoice number — those are single values an OCR model can grab in one shot. The pain is the table in the middle: a variable number of line items, each with a description, a quantity, and a price, that has to come out as clean rows you can total, reconcile, and load into a ledger.
This guide shows how to extract line items from invoices automatically with space-ocr — not as flattened text, but as a structured array where every line is its own row and every cell still points back to the exact spot on the page it was read from. If you are extracting whole documents rather than just the table, start with the broader invoice and receipt OCR walkthrough.
See it on a real receipt
The demo below runs on a real parsed fixture — a 640×640 receipt image with a 商品 (line-item) array. Hover any line and its box lights up on the image. Each row carries its own value, bounding box, and match ratio, so the table isn't a blob of text — it's a set of individually verifiable rows.

Every value carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
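To draw one of these boxes on the original image, the normalized coordinates have to be scaled back to pixels. The sketch below is a minimal illustration, not part of the space-ocr API: `to_pixels` is a hypothetical helper, the sample box is the unit-price bbox shown later in this guide, and 640×640 is the demo receipt's size.

```python
def to_pixels(bbox, width, height, grid=1000):
    """Scale a bbox from the 0-1000 normalized grid to pixel coordinates."""
    return {
        "xmin": round(bbox["xmin"] * width / grid),
        "ymin": round(bbox["ymin"] * height / grid),
        "xmax": round(bbox["xmax"] * width / grid),
        "ymax": round(bbox["ymax"] * height / grid),
    }


# Sample box: the unit-price bbox from the line-item example in this guide.
box = {"xmin": 450, "ymin": 356, "xmax": 484, "ymax": 378}

# 640x640 is the size of the demo receipt image.
print(to_pixels(box, 640, 640))
```

Because the grid is resolution-independent, the same box can be rescaled for a thumbnail or a full-size scan by changing only `width` and `height`.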
The trick: declare line items as an array field
Most OCR APIs make you extract the table as a single string and parse it yourself. space-ocr lets you describe the line-item table as part of the schema. A FieldSpec with type: "array" and a children list tells the engine: this region repeats, and each repetition has these sub-fields.
Here is the exact schema behind the receipt in the demo. The 商品 ("items") field is an array whose children are 商品名 (name), 数量 (quantity), and 単価 (unit price):
{
  "fields": [
    { "name": "店舗名", "type": "string", "description": "store name" },
    { "name": "日付", "type": "string", "description": "date" },
    { "name": "合計", "type": "string", "description": "total" },
    {
      "name": "商品",
      "type": "array",
      "description": "one row per line item",
      "children": [
        { "name": "商品名", "type": "string", "description": "item name" },
        { "name": "数量", "type": "string", "description": "quantity" },
        { "name": "単価", "type": "string", "description": "unit price" }
      ]
    }
  ]
}

Send that schema to POST /ocr/fields with the image, and the array field comes back as a list. The receipt in the demo yields 10 line items: ポッカレモン100 at 359, シール割引 at -34 (a discount line, sign preserved), エキストラBオリー at 698, and so on. You didn't write a row parser, a column splitter, or a regex. You declared the shape once.
curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/receipt.jpg",
    "imageType": "url",
    "fields": [
      { "name": "total", "type": "string" },
      { "name": "items", "type": "array",
        "children": [
          { "name": "description", "type": "string" },
          { "name": "qty", "type": "string" },
          { "name": "unit_price", "type": "string" }
        ] }
    ]
  }'

Each line item is independently verifiable
This is where line-item extraction usually goes wrong: the model returns a tidy-looking table that's subtly misaligned — a price shifted up a row, a description merged with the one below. With space-ocr, every array item carries its own bbox, vertices, match_ratio, and a field_bboxes map for its children. A single line of the receipt looks like this:
{
  "商品名": "ポッカレモン100",
  "単価": "359",
  "match_ratio": 1.0,
  "bbox_source": "vision_symbol_match",
  "field_bboxes": {
    "単価": {
      "bbox": { "xmin": 450, "ymin": 356, "xmax": 484, "ymax": 378 },
      "vertices": [
        { "x": 450, "y": 360 }, { "x": 483, "y": 356 },
        { "x": 485, "y": 374 }, { "x": 452, "y": 378 }
      ],
      "match_ratio": 1.0
    }
  }
}

So a price isn't just 359: it's 359 located at a specific rectangle on a 0–1000 normalized grid (xmin/ymin/xmax/ymax, top-left origin), with four oriented vertices that follow the document's tilt, and a match_ratio saying how much of that text was actually found on the page. A match_ratio of 1.0 means every character was located; the engine treats ≥ 0.85 as a confident match. You can sort your extracted rows by match ratio and only eyeball the weakest ones. For the full mechanics, see validating OCR with bounding boxes.
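Sorting by match ratio can be done client-side in a few lines. The sketch below assumes each line item is a dict with a top-level `match_ratio`, as in the example above. `rows_to_review` is a hypothetical helper, and the two ratios below 1.0 are made up purely to show the ordering.

```python
CONFIDENT = 0.85  # threshold this guide cites for a confident match


def rows_to_review(items, threshold=CONFIDENT):
    """Return the items below the threshold, weakest match first."""
    weak = [it for it in items if it.get("match_ratio", 0.0) < threshold]
    return sorted(weak, key=lambda it: it.get("match_ratio", 0.0))


items = [
    {"商品名": "ポッカレモン100", "単価": "359", "match_ratio": 1.0},
    # The ratios on the next two rows are illustrative, not real output.
    {"商品名": "シール割引", "単価": "-34", "match_ratio": 0.8},
    {"商品名": "エキストラBオリー", "単価": "698", "match_ratio": 0.6},
]

for it in rows_to_review(items):
    print(it["match_ratio"], it["商品名"], it["単価"])
```

Items with no `match_ratio` at all are treated as 0.0 here, so they sort to the top of the review queue rather than slipping through.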
The model never invents those coordinates. The language model returns each line item's values plus the word-token IDs it used — it does not return boxes. The engine resolves those tokens to regions the vision OCR layer actually detected and unions them. A phantom line item has no tokens to anchor to, so it can't be handed a fake position. That's what makes a 30-row table checkable instead of merely plausible.
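The anchoring idea can be illustrated with a small sketch. This is not the engine's actual code: the token IDs, the `detected` lookup, and `union_bbox` are all hypothetical, and the box values are made up. It only shows why a value backed by no detected tokens ends up without a position.

```python
def union_bbox(token_ids, detected):
    """Union the boxes of the referenced tokens; None if none were detected."""
    boxes = [detected[t] for t in token_ids if t in detected]
    if not boxes:
        return None  # nothing on the page to anchor to
    return {
        "xmin": min(b["xmin"] for b in boxes),
        "ymin": min(b["ymin"] for b in boxes),
        "xmax": max(b["xmax"] for b in boxes),
        "ymax": max(b["ymax"] for b in boxes),
    }


# Hypothetical word tokens found by the vision OCR layer (0-1000 grid).
detected = {
    "t1": {"xmin": 450, "ymin": 356, "xmax": 466, "ymax": 378},
    "t2": {"xmin": 467, "ymin": 357, "xmax": 484, "ymax": 377},
}

print(union_bbox(["t1", "t2"], detected))  # one box covering both tokens
print(union_bbox(["t9"], detected))        # unknown token: no box
```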
Click a line, land on the pixels
Because every line item knows where it lives, spot-checking a table becomes a click. In the app you click any cell — a description, a quantity, a unit price — and the source image highlights the exact region that value came from, with a zoomed crop. For an invoice with thirty lines, your eye goes straight to the one that looks off instead of scanning the whole page.
From line items to a CSV your accounting tool can read
Once line items are stored in a sheet, exporting is where the array shape pays off again. space-ocr expands array fields on export: the header becomes # plus the scalar columns, plus one column per array child named colName.childName (so 商品.商品名, 商品.数量, 商品.単価). Each line item becomes its own sub-row — a receipt with 10 items produces 10 rows, all carrying the same store name and date. That's exactly the long, flat format spreadsheets and ledger importers expect.
A trimmed export for the demo receipt looks like this:
| # | 店舗名 | 日付 | 商品.商品名 | 商品.単価 |
|---|---|---|---|---|
| 1 | KINSHO | 2019年08月17日 | ポッカレモン100 | 359 |
| 2 | KINSHO | 2019年08月17日 | エキストラBオリー | 698 |
| 3 | KINSHO | 2019年08月17日 | シール割引 | -34 |
The file is UTF-8 with a BOM, so Japanese, Korean, and Chinese item names open cleanly in Excel. Any manual correction you made overrides the OCR value in the export, while the original stays on record. For the end-to-end image-folder-to-spreadsheet flow, see scanned documents to CSV.
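If you consume the API directly rather than exporting from the app, the same long, flat shape can be approximated in a few lines. The sketch below is an assumption-laden imitation of the export described above, not the product's implementation: `expand_rows` and `to_csv_bytes` are hypothetical helpers, `#` is taken to be a running row number, and a document with no items is given a single row with blank child cells.

```python
import csv
import io


def expand_rows(docs, scalar_cols, array_col, child_cols):
    """Flatten documents so that each array item becomes its own row."""
    header = ["#"] + scalar_cols + [f"{array_col}.{c}" for c in child_cols]
    rows = []
    for doc in docs:
        scalars = [doc.get(c, "") for c in scalar_cols]
        # Assumption: a document without items still gets one (blank) row.
        items = doc.get(array_col) or [{}]
        for item in items:
            children = [item.get(c, "") for c in child_cols]
            rows.append([len(rows) + 1] + scalars + children)
    return header, rows


def to_csv_bytes(header, rows):
    """Serialize as CSV, encoded as UTF-8 with a BOM ("utf-8-sig")."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8-sig")


receipt = {
    "店舗名": "KINSHO",
    "日付": "2019年08月17日",
    "商品": [
        {"商品名": "ポッカレモン100", "単価": "359"},
        {"商品名": "エキストラBオリー", "単価": "698"},
        {"商品名": "シール割引", "単価": "-34"},
    ],
}

header, rows = expand_rows([receipt], ["店舗名", "日付"], "商品", ["商品名", "単価"])
data = to_csv_bytes(header, rows)
print(data.decode("utf-8-sig"))
```

Writing `data` to a file in binary mode keeps the BOM intact, which is what lets Excel detect the encoding.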
Do it in a few steps
- Define an array field for the line items. In your fields[] schema, add a field with type "array" and a children list, e.g. description, qty, unit_price. This tells the engine the line-item region repeats with those sub-fields.
- Send the invoice to /ocr/fields. POST the image (as a URL or base64) with imageType and your fields[] to https://api.space-ocr.com/ocr/fields. The array field comes back as a list, one object per line item.
- Verify each line. Every array item carries its own bbox, vertices, and match_ratio. Sort by match_ratio or click a cell in the app to jump to the exact region on the image and confirm the value.
- Export to CSV. Export the sheet: array children expand into colName.childName columns and each line item becomes its own row, with document-level fields repeated, ready for your accounting tool.
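The first two steps can be sketched in Python using only the standard library. The endpoint, headers, and payload keys mirror the curl example earlier in this guide; the function names are hypothetical, and because the exact layout of the response body is not specified here, the sketch returns the parsed JSON as-is rather than guessing where the line items sit.

```python
import json
import os
import urllib.request

API_URL = "https://api.space-ocr.com/ocr/fields"  # endpoint from the curl example


def line_item_fields():
    """Schema with one scalar field and one array field, as in the curl example."""
    return [
        {"name": "total", "type": "string"},
        {"name": "items", "type": "array", "children": [
            {"name": "description", "type": "string"},
            {"name": "qty", "type": "string"},
            {"name": "unit_price", "type": "string"},
        ]},
    ]


def build_request(image_url, api_key, fields=None):
    """Build (but do not send) the POST request."""
    payload = {
        "image": image_url,
        "imageType": "url",
        "fields": fields or line_item_fields(),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract(image_url, api_key):
    """Send the request and return the parsed JSON body, whatever its layout."""
    request = build_request(image_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


# Only hits the network when an API key is present in the environment.
if __name__ == "__main__" and os.environ.get("SPACE_OCR_API_KEY"):
    result = extract("https://example.com/receipt.jpg", os.environ["SPACE_OCR_API_KEY"])
    print(json.dumps(result, ensure_ascii=False, indent=2))
```

Keeping `build_request` separate from `extract` makes the payload easy to inspect or unit-test without sending anything.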
How do I extract line items from invoices automatically?
Can OCR handle a variable number of line items per invoice?
How do line items appear in the CSV export?
How do I know a line item was read correctly?
Does it work on non-English invoices?
Extract line items from your own invoices
Free tier — 100 scans a month, no credit card. Declare an array field once and every line comes back as a verifiable row.