From PDF Chaos to Structured Data in Seconds
You have 200 supplier invoices on your desk. Each one has a different format. Normally, this means a full day of manual data entry. With DataUnchain, it takes 3 minutes.
Step 1: Scan and drop
Your scanner outputs PDFs to a network folder. DataUnchain's Watchdog service detects each new file instantly. Multi-page PDFs are automatically split into individual page images.
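To make the detection step concrete, here is a minimal sketch of how new PDFs in a folder could be noticed by polling. It is not DataUnchain's actual Watchdog code: the function name `find_new_pdfs` and the `seen` set are hypothetical, and page splitting is left out.

```python
from pathlib import Path


def find_new_pdfs(folder: Path, seen: set[str]) -> list[Path]:
    """Return PDFs in `folder` that were not reported before, and remember them."""
    new_files = []
    # sorted() gives a stable processing order; the pattern is case-sensitive
    # on most Linux filesystems, so "SCAN.PDF" would need its own pattern.
    for path in sorted(folder.glob("*.pdf")):
        if path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling this in a loop with a short sleep between calls gives a simple watcher. A production service would more likely rely on filesystem notifications and would wait until the scanner has finished writing each file.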
Step 2: AI reads each page
Qwen 3.5 VL analyses each page image. Unlike plain OCR, which only transcribes characters, it interprets the layout as well: it locates the invoice number and the totals, and can read handwritten notes next to the line items.
You've configured the extraction prompt once:
vat_id, subtotal, vat, total, line_items.
Reply in JSON."
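Once the model replies in JSON, the reply has to be turned into typed values before anything can be checked. The sketch below shows one way to do that. It assumes the reply is a flat JSON object whose keys match the field names listed in the prompt; the `Invoice` class, the `parse_reply` function and the sample VAT ID in the usage note are illustrative, not part of DataUnchain.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Invoice:
    vat_id: str
    subtotal: Decimal
    vat: Decimal
    total: Decimal
    line_items: list


def parse_reply(reply: str) -> Invoice:
    """Parse the model's JSON reply, keeping amounts as exact decimals."""
    # parse_float=Decimal avoids binary floating-point rounding on amounts.
    data = json.loads(reply, parse_float=Decimal)
    return Invoice(
        vat_id=str(data["vat_id"]),
        # str() also covers amounts returned as integers or quoted strings.
        subtotal=Decimal(str(data["subtotal"])),
        vat=Decimal(str(data["vat"])),
        total=Decimal(str(data["total"])),
        line_items=list(data.get("line_items", [])),
    )
```

For example, `parse_reply('{"vat_id": "IT12345678901", "subtotal": 100.00, "vat": 22.00, "total": 122.00, "line_items": []}')` yields an `Invoice` whose amounts can be added and compared exactly.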
Step 3: Math validation
For each invoice, Python checks that subtotal + vat equals total to within a 2-cent tolerance. If the difference is larger, the record is flagged NEEDS_REVIEW instead of VALIDATED.
Out of 200 invoices, typically 3–5 get flagged — either because the AI misread a digit, or because the original invoice actually has an error.
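The check in this step fits in a few lines. This is a sketch rather than DataUnchain's own code: the function name `validate_totals` is hypothetical, and it assumes that a difference of exactly 2 cents still counts as a match.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.02")  # the 2-cent tolerance from Step 3


def validate_totals(subtotal: Decimal, vat: Decimal, total: Decimal) -> str:
    """Return "VALIDATED" if subtotal + vat is within tolerance of total."""
    difference = abs(subtotal + vat - total)
    # Assumption: the tolerance is inclusive (a 2-cent gap still passes).
    return "VALIDATED" if difference <= TOLERANCE else "NEEDS_REVIEW"
```

Decimal is used instead of float so that a sum such as 100.00 + 22.00 compares exactly against 122.00, with no binary rounding noise eating into the tolerance.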
Step 4: Clean export
All 200 invoices are now in PostgreSQL. Export to Excel with one click. Upload to your accounting software. Done.
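If you want to see what an export of this kind looks like outside the one-click button, the sketch below writes records to CSV text, which Excel and most accounting tools can open. CSV is a stand-in here, not DataUnchain's actual Excel export, and the `export_csv` function and the `status` column are hypothetical.

```python
import csv
import io


def export_csv(records: list[dict], fieldnames: list[str]) -> str:
    """Serialise invoice records to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()        # first row: column names
    writer.writerows(records)   # one row per invoice
    return buffer.getvalue()
```

Including a status column in the export keeps the NEEDS_REVIEW records visible, so they can be filtered and corrected in the spreadsheet before upload.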
Total time: 3 minutes instead of 8 hours.