Extract line items from invoices automatically

Pull invoice and receipt line items into structured rows automatically. Declare an array field, get one verifiable row per item — each with a bounding box — and export to CSV.

7 min read · 2026-06-25

Invoices and receipts are the documents people most want to digitize, and the hardest part is never the header. Vendor name, date, invoice number — those are single values an OCR model can grab in one shot. The pain is the table in the middle: a variable number of line items, each with a description, a quantity, and a price, that has to come out as clean rows you can total, reconcile, and load into a ledger.

This guide shows how to extract line items from invoices automatically with space-ocr — not as flattened text, but as a structured array where every line is its own row and every cell still points back to the exact spot on the page it was read from. If you are extracting whole documents rather than just the table, start with the broader invoice and receipt OCR walkthrough.

See it on a real receipt

The demo below runs on a real parsed fixture — a 640×640 receipt image with a 商品 (line-item) array. Hover any line and its box lights up on the image. Each row carries its own value, bounding box, and match ratio, so the table isn't a blob of text — it's a set of individually verifiable rows.

[Interactive demo: source receipts with extracted-field bounding boxes and verified fields. Sample receipts: KINSHO (合計 2,045) and ライフ (合計 4,286).]

Every value carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
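
To draw one of these boxes on your own copy of the image, the normalized coordinates need to be scaled back to pixels. The sketch below is one way to do that, assuming only what is stated above (a 0–1000 grid with a top-left origin); the function name `to_pixels` and the sample box are illustrative, not part of the API.

```python
def to_pixels(bbox, width, height, grid=1000):
    """Scale a bbox on the 0-1000 normalized grid to pixel coordinates.

    The origin is the top-left corner, so x scales with the image width
    and y scales with the image height.
    """
    return {
        "xmin": round(bbox["xmin"] * width / grid),
        "ymin": round(bbox["ymin"] * height / grid),
        "xmax": round(bbox["xmax"] * width / grid),
        "ymax": round(bbox["ymax"] * height / grid),
    }


# Illustrative box on the normalized grid, mapped onto a 640x640 image.
box = {"xmin": 450, "ymin": 356, "xmax": 484, "ymax": 378}
print(to_pixels(box, 640, 640))
```

Because the grid is independent of resolution, the same box can be mapped onto a thumbnail or a full-size scan by changing only `width` and `height`.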

The trick: declare line items as an array field

Most OCR APIs make you extract the table as a single string and parse it yourself. space-ocr lets you describe the line-item table as part of the schema. A FieldSpec with type: "array" and a children list tells the engine: this region repeats, and each repetition has these sub-fields.

Here is the exact schema behind the receipt in the demo. The 商品 ("items") field is an array whose children are 商品名 (name), 数量 (quantity), and 単価 (unit price):

fields[] — line items as an array
{
  "fields": [
    { "name": "店舗名", "type": "string", "description": "store name" },
    { "name": "日付",   "type": "string", "description": "date" },
    { "name": "合計",   "type": "string", "description": "total" },
    {
      "name": "商品",
      "type": "array",
      "description": "one row per line item",
      "children": [
        { "name": "商品名", "type": "string", "description": "item name" },
        { "name": "数量",   "type": "string", "description": "quantity" },
        { "name": "単価",   "type": "string", "description": "unit price" }
      ]
    }
  ]
}

Send that schema to POST /ocr/fields with the image, and the array field comes back as a list. The receipt in the demo yields 10 line items: ポッカレモン100 at 359, シール割引 at -34 (a discount line, sign preserved), エキストラBオリー at 698, and so on. You didn't write a row parser, a column splitter, or a regex. You declared the shape once.
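
Since the schema declares every value as a string, totalling the rows means parsing them first. Here is a minimal sketch, assuming amounts look like the ones in the demo (optional thousands separators, a leading minus on discount lines). The helper names are hypothetical, and the sketch ignores quantity and tax, so it is a starting point rather than a full reconciliation against 合計.

```python
from decimal import Decimal


def parse_amount(text):
    """Parse an extracted amount string such as '2,045' or '-34'."""
    cleaned = text.replace(",", "").replace("¥", "").replace("円", "").strip()
    return Decimal(cleaned)


def items_total(items, price_key="単価"):
    """Sum the price column over all rows; discount lines stay negative."""
    return sum((parse_amount(item[price_key]) for item in items), Decimal(0))


# Three of the demo receipt's rows (the full receipt has 10).
items = [
    {"商品名": "ポッカレモン100", "単価": "359"},
    {"商品名": "シール割引", "単価": "-34"},
    {"商品名": "エキストラBオリー", "単価": "698"},
]
print(items_total(items))
```

`Decimal` is used instead of `float` so that currency sums stay exact.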

extract line items from one invoice
curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/receipt.jpg",
    "imageType": "url",
    "fields": [
      { "name": "total", "type": "string" },
      { "name": "items", "type": "array",
        "children": [
          { "name": "description", "type": "string" },
          { "name": "qty",         "type": "string" },
          { "name": "unit_price",  "type": "string" }
        ] }
    ]
  }'
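
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above field for field; the function names are illustrative, and the assumption that the response body is JSON follows from the examples in this article rather than from a formal API reference.

```python
import json
import os
import urllib.request

API_URL = "https://api.space-ocr.com/ocr/fields"


def build_request(image_url, api_key):
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "image": image_url,
        "imageType": "url",
        "fields": [
            {"name": "total", "type": "string"},
            {"name": "items", "type": "array", "children": [
                {"name": "description", "type": "string"},
                {"name": "qty", "type": "string"},
                {"name": "unit_price", "type": "string"},
            ]},
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the response, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Building the request is offline; only send() touches the network.
req = build_request("https://example.com/receipt.jpg",
                    os.environ.get("SPACE_OCR_API_KEY", "YOUR_API_KEY"))
print(req.get_method(), req.full_url)
```

Calling `send(req)` performs the actual request; keeping it separate from `build_request` makes the payload easy to inspect or log before anything is sent.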

Each line item is independently verifiable

This is where line-item extraction usually goes wrong: the model returns a tidy-looking table that's subtly misaligned — a price shifted up a row, a description merged with the one below. With space-ocr, every array item carries its own bbox, vertices, match_ratio, and a field_bboxes map for its children. A single line of the receipt looks like this:

one item of the 商品 array (abridged)
{
  "商品名": "ポッカレモン100",
  "単価": "359",
  "match_ratio": 1.0,
  "bbox_source": "vision_symbol_match",
  "field_bboxes": {
    "単価": {
      "bbox": { "xmin": 450, "ymin": 356, "xmax": 484, "ymax": 378 },
      "vertices": [
        { "x": 450, "y": 360 }, { "x": 483, "y": 356 },
        { "x": 485, "y": 374 }, { "x": 452, "y": 378 }
      ],
      "match_ratio": 1.0
    }
  }
}

So a price isn't just 359 — it's 359 located at a specific rectangle on a 0–1000 normalized grid (xmin/ymin/xmax/ymax, top-left origin), with four oriented vertices that follow the document's tilt, and a match_ratio saying how much of that text was actually found on the page. A match_ratio of 1.0 means every character was located; the engine treats ≥ 0.85 as a confident match. You can sort your extracted rows by match ratio and only eyeball the weakest ones. For the full mechanics, see validating OCR with bounding boxes.
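
The tilt those vertices encode can be estimated from the top edge of the box. The sketch below assumes the vertices are ordered top-left, top-right, bottom-right, bottom-left, as they appear in the example item, and that the image is square (like the 640×640 demo image) so the normalized grid does not distort angles. `tilt_degrees` is a hypothetical helper, not an API field.

```python
import math


def tilt_degrees(vertices):
    """Angle of the box's top edge relative to horizontal, in degrees.

    Assumes the order top-left, top-right, bottom-right, bottom-left.
    y grows downward (top-left origin), so a negative angle means the
    top edge rises toward the right.
    """
    top_left, top_right = vertices[0], vertices[1]
    dx = top_right["x"] - top_left["x"]
    dy = top_right["y"] - top_left["y"]
    return math.degrees(math.atan2(dy, dx))


# Vertices of the 単価 cell from the example item.
vertices = [
    {"x": 450, "y": 360}, {"x": 483, "y": 356},
    {"x": 485, "y": 374}, {"x": 452, "y": 378},
]
print(round(tilt_degrees(vertices), 1))
```

For a non-square image, scale the vertices to pixels before measuring the angle.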

✓ Verified

The model never invents those coordinates. The language model returns each line item's values plus the word-token IDs it used — it does not return boxes. The engine resolves those tokens to regions the vision OCR layer actually detected and unions them. A phantom line item has no tokens to anchor to, so it can't be handed a fake position. That's what makes a 30-row table checkable instead of merely plausible.
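
To make the anchoring idea concrete, here is an illustrative sketch of resolving token IDs to detected regions and unioning them. It is not the engine's actual implementation: the token ID format, the `detected` mapping, and both function names are assumptions made up for this example.

```python
def union_boxes(boxes):
    """Smallest axis-aligned box that covers every box in the list."""
    return {
        "xmin": min(b["xmin"] for b in boxes),
        "ymin": min(b["ymin"] for b in boxes),
        "xmax": max(b["xmax"] for b in boxes),
        "ymax": max(b["ymax"] for b in boxes),
    }


def resolve_bbox(token_ids, detected_tokens):
    """Map token IDs to detected regions and union them.

    Returns None when no ID matches a detected token, so a value with
    nothing to anchor to never receives a position.
    """
    boxes = [detected_tokens[t] for t in token_ids if t in detected_tokens]
    return union_boxes(boxes) if boxes else None


# Hypothetical OCR-layer output: token ID -> box on the 0-1000 grid.
detected = {
    "t17": {"xmin": 120, "ymin": 356, "xmax": 200, "ymax": 378},
    "t18": {"xmin": 205, "ymin": 357, "xmax": 260, "ymax": 379},
}
print(resolve_bbox(["t17", "t18"], detected))  # one box spanning both tokens
print(resolve_bbox(["t99"], detected))         # unknown token -> None
```

The key property is the `None` branch: a position can only come from regions that were detected, never from the model's own output.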

Click a line, land on the pixels

Because every line item knows where it lives, spot-checking a table becomes a click. In the app you click any cell — a description, a quantity, a unit price — and the source image highlights the exact region that value came from, with a zoomed crop. For an invoice with thirty lines, your eye goes straight to the one that looks off instead of scanning the whole page.

Click any line-item cell → the matching region lights up on the original invoice.

From line items to a CSV your accounting tool can read

Once line items are stored in a sheet, exporting is where the array shape pays off again. space-ocr expands array fields on export: the header becomes # plus the scalar columns, plus one column per array child named colName.childName (so 商品.商品名, 商品.数量, 商品.単価). Each line item becomes its own sub-row — a receipt with 10 items produces 10 rows, all carrying the same store name and date. That's exactly the long, flat format spreadsheets and ledger importers expect.

Export the sheet — array line items expand into one row per item with colName.childName columns.

A trimmed export for the demo receipt looks like this:

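| # | 店舗名 | 日付 | 商品.商品名 | 商品.単価 |
|---|--------|------|-------------|-----------|
| 1 | KINSHO | 2019年08月17日 | ポッカレモン100 | 359 |
| 2 | KINSHO | 2019年08月17日 | エキストラBオリー | 698 |
| 3 | KINSHO | 2019年08月17日 | シール割引 | -34 |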

The file is UTF-8 with a BOM, so Japanese, Korean, and Chinese item names open cleanly in Excel. Any manual correction you made overrides the OCR value in the export, while the original stays on record. For the end-to-end image-folder-to-spreadsheet flow, see scanned documents to CSV.
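
If you are working from the API response rather than the app, the same expansion is straightforward to reproduce. The sketch below is an approximation of the export shape described above, not the product's exporter: it assumes `#` is a running row number (as in the trimmed export) and that a document with no line items contributes no rows. Python's `utf-8-sig` codec writes the BOM.

```python
import csv
import io


def expand_rows(doc, scalar_cols, array_col, child_cols):
    """Yield one flat row per array item, repeating the scalar columns."""
    for item in doc.get(array_col, []):
        row = {col: doc.get(col, "") for col in scalar_cols}
        for child in child_cols:
            row[f"{array_col}.{child}"] = item.get(child, "")
        yield row


def to_csv_bytes(docs, scalar_cols, array_col, child_cols):
    """Render documents as CSV bytes, UTF-8 encoded with a leading BOM."""
    header = ["#"] + scalar_cols + [f"{array_col}.{c}" for c in child_cols]
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=header)
    writer.writeheader()
    n = 0  # running row number for the '#' column (an assumption)
    for doc in docs:
        for row in expand_rows(doc, scalar_cols, array_col, child_cols):
            n += 1
            writer.writerow({"#": n, **row})
    return buf.getvalue().encode("utf-8-sig")


receipt = {
    "店舗名": "KINSHO",
    "日付": "2019年08月17日",
    "商品": [
        {"商品名": "ポッカレモン100", "単価": "359"},
        {"商品名": "エキストラBオリー", "単価": "698"},
        {"商品名": "シール割引", "単価": "-34"},
    ],
}
data = to_csv_bytes([receipt], ["店舗名", "日付"], "商品", ["商品名", "単価"])
print(data.decode("utf-8-sig"))
```

Write `data` to a file opened in binary mode (`"wb"`) so the BOM and line endings are preserved exactly.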

Do it in a few steps

  1. Define an array field for the line items
    In your fields[] schema, add a field with type "array" and a children list — e.g. description, qty, unit_price. This tells the engine the line-item region repeats with those sub-fields.
  2. Send the invoice to /ocr/fields
    POST the image (as a URL or base64) with imageType and your fields[] to https://api.space-ocr.com/ocr/fields. The array field comes back as a list, one object per line item.
  3. Verify each line
    Every array item carries its own bbox, vertices, and match_ratio. Sort by match_ratio or click a cell in the app to jump to the exact region on the image and confirm the value.
  4. Export to CSV
    Export the sheet — array children expand into colName.childName columns and each line item becomes its own row, with document-level fields repeated, ready for your accounting tool.
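
Step 3 can also be scripted. The sketch below is one way to pick out the rows worth a manual look, using the 0.85 threshold cited in this article; the sample rows and their `match_ratio` values are made up for illustration, and a row with no `match_ratio` is treated as needing review.

```python
CONFIDENT = 0.85  # threshold this article cites for a confident match


def rows_to_review(items, threshold=CONFIDENT):
    """Return (index, item) pairs below the threshold, weakest first.

    A row without a match_ratio is treated as 0.0 so it is always reviewed.
    """
    weak = [(i, item) for i, item in enumerate(items)
            if item.get("match_ratio", 0.0) < threshold]
    return sorted(weak, key=lambda pair: pair[1].get("match_ratio", 0.0))


# Illustrative rows; these match_ratio values are invented for the example.
items = [
    {"description": "Widget A", "unit_price": "359", "match_ratio": 1.0},
    {"description": "Discount", "unit_price": "-34", "match_ratio": 0.80},
    {"description": "Widget B", "unit_price": "698", "match_ratio": 0.72},
]
for index, item in rows_to_review(items):
    print(index, item["description"], item["match_ratio"])
```

The original index is kept alongside each row so a flagged item can be traced back to its position in the array.
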

How do I extract line items from invoices automatically?
Declare the line-item table as a field with type "array" and a children list (for example description, qty, unit_price), then POST the image to /ocr/fields. The engine returns the array as a list of rows, one object per line item, without you writing any table-parsing code. Each item also carries its own bounding box, vertices, and match ratio.

Can OCR handle a variable number of line items per invoice?
Yes. An array field doesn't assume a fixed row count. The receipt in the demo yields 10 items; another invoice might yield 30. The engine groups repeated rows from the detected text layout, so you get as many line-item objects as there are lines on the page, each independently located on the image.

How do line items appear in the CSV export?
Array fields expand on export. The header is '#' plus the scalar columns plus one column per array child, named colName.childName (e.g. items.description, items.qty, items.unit_price). Each line item becomes its own sub-row, repeating the document-level fields like vendor and date, which is the flat format ledger and spreadsheet importers expect. The file is UTF-8 with a BOM for clean CJK in Excel.

How do I know a line item was read correctly?
Every array item ships with a match_ratio (the fraction of its characters located on the page) and a bounding box. A match_ratio of 1.0 means every character was found; the engine treats 0.85 and above as a confident match. You can sort rows by match ratio to review only the weakest, or click a cell in the app to highlight the exact region it came from.

Does it work on non-English invoices?
Yes. Language detection is automatic: Japanese, Korean, Chinese, and English run through one engine, including full-width characters and vertical CJK text. The demo extracts a Japanese receipt's 商品 (items) array with 商品名, 数量, and 単価 children. There's no language flag to set.

Extract line items from your own invoices

Free tier — 100 scans a month, no credit card. Declare an array field once and every line comes back as a verifiable row.
