Japanese OCR that returns data you can check
Read Japanese receipts, invoices, and delivery notes with space-ocr: mixed scripts, full-width and vertical text, CJK-safe CSV, every value located with a box.
Japanese is where ordinary OCR quietly falls apart. A single receipt mixes kanji, kana, half-width katakana, full-width digits, and a stray run of English, and the totals might sit in a vertical column down the right edge. Most tools either force you to pick a language first or hand back a flat blob of text that loses the layout. Japanese OCR that actually helps has to read all of that at once and tell you where each number came from.
space-ocr does both. It reads Japanese documents into structured fields and returns every value with the exact spot on the page it was read from: a box you can see, plus a score for how well the text matched the characters detected on the page. Language detection is automatic, so there is no hint to set; one engine handles Japanese, Korean, Chinese, and English together.
See a real Japanese extraction you can check
Hover any field below. The two receipts read here are real — a KINSHO 布施店 slip totalling 2,045 and a ライフ 国分店 slip totalling 4,286, both dated August 2019. Every value, box, and match score comes straight from a parsed result, not a mockup, and the boxes follow each line of mixed kanji-kana-digit text.

Each value with a box carries a verified on-page location — bbox + 4-point vertices + match_ratio — on a 0–1000 normalized grid (0,0 top-left → 1000,1000 bottom-right), the same shape the live API returns. Hover a field to trace it back to the pixels it came from.
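To draw a returned box over your own copy of the page image, the normalized coordinates have to be scaled to pixels. The sketch below shows one way to do that. It assumes the bbox is ordered [x_min, y_min, x_max, y_max] and that vertices are (x, y) pairs; the text above does not state the ordering, and the function names are hypothetical helpers, not part of space-ocr.

```python
from typing import List, Sequence, Tuple

GRID = 1000  # normalized grid: (0, 0) top-left, (1000, 1000) bottom-right


def grid_to_pixels(bbox: Sequence[float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a box from the 0-1000 grid to pixel coordinates.

    Assumption: bbox is [x_min, y_min, x_max, y_max].
    """
    x_min, y_min, x_max, y_max = bbox
    return (
        round(x_min * width / GRID),
        round(y_min * height / GRID),
        round(x_max * width / GRID),
        round(y_max * height / GRID),
    )


def vertices_to_pixels(
    vertices: Sequence[Tuple[float, float]], width: int, height: int
) -> List[Tuple[int, int]]:
    """Scale the 4-point vertices (assumed to be (x, y) pairs) to pixels."""
    return [(round(x * width / GRID), round(y * height / GRID)) for x, y in vertices]


# A box spanning x 100-500 and y 250-300 on a 1200 x 1600 pixel page image.
print(grid_to_pixels([100, 250, 500, 300], 1200, 1600))  # (120, 400, 600, 480)
```

Because the grid is independent of resolution, the same result can be overlaid on a thumbnail or a full-size scan by changing only width and height.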
How Japanese OCR works in space-ocr
The LLM never invents coordinates. It reads the document and returns each value plus the ids of the word tokens it used. A character matcher runs first, matching the value's characters against the symbols Vision actually detected on the page. That match produces the box, the oriented vertices, and the match_ratio; the token ids act only as a secondary override. As a result, full-width and half-width forms of the same digit still resolve to one value, and you get a confidence signal for every field instead of a number you have to trust blindly.
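The idea of comparing an extracted value with the characters detected on the page can be illustrated with the Python standard library. This is not space-ocr's matcher: it is a stand-in that assumes Unicode NFKC normalization to fold full-width and half-width forms together, and uses difflib's similarity ratio as a rough analogue of match_ratio. Both function names are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def fold_width(text: str) -> str:
    """Fold full-width digits and half-width katakana to their standard forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def similarity(value: str, detected: str) -> float:
    """Similarity in [0, 1] between a value and the detected text, after width folding.

    Illustrative stand-in only; the real match_ratio may be computed differently.
    """
    return SequenceMatcher(None, fold_width(value), fold_width(detected)).ratio()


print(fold_width("２，０４５"))          # 2,045
print(fold_width("ﾗｲﾌ"))               # ライフ
print(similarity("2045", "２０４５"))   # 1.0
print(similarity("2045", "2046"))       # 0.75
```

Folding widths before comparing is what lets a full-width total on the receipt and an ASCII total in the output count as the same value, while a genuinely misread digit still lowers the score.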
Drop a PDF into the app and each page is rendered to an image first, then read — handy for multi-page invoices and delivery notes. If you call the API directly, send the page images (the public API takes raster images — JPEG, PNG, GIF, BMP, TIFF, WebP) and the structured result is the same. Pass a built-in templateId like receipt, invoice, or delivery, or define your own fields including an array field whose children describe one line-item row.
curl -s https://api.space-ocr.com/ocr/fields \
  -H "Authorization: Bearer $SPACE_OCR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "https://example.com/receipt-jp.jpg",
    "imageType": "url",
    "templateId": "receipt"
  }'

How to OCR a Japanese document
- Add your document: In the app, drop a receipt, invoice, or PDF; each page is rendered to an image and queued for OCR. For the API, send the page images (url or base64) to /ocr/fields. No language setting is needed.
- Pick a template or fields: Pass a built-in templateId like 'receipt', 'invoice', or 'delivery', or supply your own fields, including an array field with children for line-item tables.
- Read the structured result: Each value returns with its bbox, vertices, match_ratio, and bbox_source, plus a field_bboxes map locating every field on the page, full-width and vertical text included.
- Verify anything: Click a cell to highlight the exact region it was read from; a match_ratio below 0.85 flags a value worth a closer look. Edits are stored beside the original OCR value.
- Export or query: Download CSV (UTF-8 BOM so Japanese opens cleanly, line items unfolded) or query a stored sheet with GET /view using where, sort, and select, with no re-OCR and no extra charge.
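If you post-process results yourself, the verify and export steps above can be sketched in a few lines of Python. The response shape here is an assumption made for illustration (a dict of field name to value and match_ratio); the real JSON layout is not shown on this page. The sample ratios are invented, and treating a value with no match_ratio as needing review is also an assumption.

```python
import csv
import io
from typing import Dict, List

REVIEW_THRESHOLD = 0.85  # from the text: below this, a value is worth a closer look

# Hypothetical parsed result; field names and ratios are made up for the example.
sample_fields: Dict[str, dict] = {
    "store": {"value": "ライフ 国分店", "match_ratio": 1.0},
    "total": {"value": "4,286", "match_ratio": 0.8},
}


def needs_review(fields: Dict[str, dict], threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return field names whose match_ratio is missing or below the threshold."""
    flagged = []
    for name, field in fields.items():
        ratio = field.get("match_ratio")
        if ratio is None or ratio < threshold:
            flagged.append(name)
    return flagged


def to_csv_bytes(fields: Dict[str, dict]) -> bytes:
    """Serialize fields to CSV bytes with a UTF-8 BOM so spreadsheets detect the encoding."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["field", "value", "match_ratio"])
    for name, field in fields.items():
        writer.writerow([name, field["value"], field.get("match_ratio")])
    # The "utf-8-sig" codec prepends the byte order mark EF BB BF.
    return buffer.getvalue().encode("utf-8-sig")


print(needs_review(sample_fields))  # ['total']
```

Writing the bytes from to_csv_bytes to a file gives a CSV that spreadsheet software opens with the Japanese text intact, which is the same reason the built-in export includes the BOM.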
Simple, predictable pricing
Pay $0.05 per image (¥10 / ₩100), with a free tier of 100 scans a month and no credit card. Flat plans add monthly scans, more sheets, and storage.
Do I have to tell it the document is in Japanese?
Does it handle full-width characters and vertical text?
Will Japanese text survive the CSV export, or turn into mojibake?
Does Japanese OCR keep the location of each value?
Which Japanese documents can it read?
How much does Japanese OCR cost?
Turn your own Japanese documents into checkable data
Free tier — 100 scans a month, no credit card. Every value comes back with its on-page location.