AI document extraction pipeline for logistics and distribution. Automates the manual data entry of paper-based delivery notes, invoices, and shipping documents — turning unstructured scanned documents into structured database records.
Designed for distribution operations processing thousands of delivery notes per week.
Status: Public showcase. The two demos here (app/index.html and dashboard/index.html) are anonymized recreations of the real system I built under a client engagement. The original production code is proprietary and stays with the client; what you see here is the same architecture, the same UX flow, with all client identifiers replaced.
Distribution companies receive hundreds of paper delivery notes daily from drivers, suppliers, and couriers. Each note contains:
- Delivery address and contact
- Line items with product codes, quantities, weights
- Signatures, timestamps, special instructions
- Handwritten annotations
Manual data entry into the ERP takes a team of clerks several hours per day. Errors lead to billing discrepancies and stock mismatches.
With DocFlow, documents are scanned or photographed and run through a GPT-4o Vision pipeline that extracts structured data, validates it against known product codes and addresses, and writes it directly to the operational database, with a human review queue for low-confidence extractions.
```
Physical document
        │
  📷 Photo / scan
        │
        ▼
Cloudflare R2            ← upload via mobile app or web form
(document storage)
        │
        ▼
Cloudflare Worker        ← triggers on upload
(orchestration layer)
        │
        ▼
GPT-4o Vision            ← structured JSON extraction
(extraction model)
        │
        ▼
Validation layer         ← cross-checks against product catalogue + address book
(Supabase queries)
        │
   ┌────┴────┐
   │         │
✓ Pass     ✗ Fail / low confidence
   │         │
   ▼         ▼
Auto-write   Human review queue
to ERP DB    (Next.js dashboard)
```
Each document goes through a multi-step extraction:
Step 1 — Pre-processing The image is resized and contrast-adjusted for OCR readability. Documents with multiple pages are split and processed individually.
Step 2 — GPT-4o Vision call The model receives the image with a structured extraction prompt requesting a typed JSON response:
```typescript
type DeliveryNoteExtraction = {
  document_type: "delivery_note" | "invoice" | "return_note" | "unknown";
  document_number: string | null;
  date: string | null; // ISO 8601
  supplier: { name: string; vat_number: string | null };
  delivery_address: { street: string; postal_code: string; city: string };
  line_items: Array<{
    product_code: string | null;
    description: string;
    quantity: number;
    unit: string;
    weight_kg: number | null;
  }>;
  signature_present: boolean;
  notes: string | null;
  confidence: number; // 0-1, model's self-assessed confidence
};
```

Step 3 — Validation Extracted product codes are matched against the product catalogue in Supabase. Addresses are fuzzy-matched against the known delivery address book. Mismatches or low-confidence fields are flagged for human review.
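The validation step could look roughly like the sketch below. It is an illustration under stated assumptions, not the production code: the function names (`unknownProductCodes`, `matchAddress`), the in-memory catalogue and address book standing in for Supabase queries, the Levenshtein-based similarity, the street-only comparison, and the 0.8 cut-off are all hypothetical.

```typescript
// Hypothetical sketch: in-memory stand-ins for the Supabase catalogue and address book.
type Address = { street: string; postal_code: string; city: string };
type LineItem = { product_code: string | null; description: string };

// Product codes that are missing or absent from the catalogue get flagged.
function unknownProductCodes(items: LineItem[], catalogue: Set<string>): string[] {
  return items
    .filter((item) => item.product_code === null || !catalogue.has(item.product_code))
    .map((item) => item.product_code ?? `(no code) ${item.description}`);
}

// Lowercase, strip diacritics and punctuation, collapse whitespace.
function normalize(s: string): string {
  return s
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9 ]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Dynamic-programming Levenshtein edit distance, keeping one row at a time.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  const maxLen = Math.max(x.length, y.length);
  return maxLen === 0 ? 1 : 1 - levenshtein(x, y) / maxLen;
}

// Best address-book entry by street similarity, or null if nothing reaches minScore (assumed 0.8).
function matchAddress(
  extracted: Address,
  book: Address[],
  minScore = 0.8
): { address: Address; score: number } | null {
  let best: { address: Address; score: number } | null = null;
  for (const candidate of book) {
    const score = similarity(extracted.street, candidate.street);
    if (score >= minScore && (best === null || score > best.score)) {
      best = { address: candidate, score };
    }
  }
  return best;
}
```

A `null` result from `matchAddress`, or a non-empty list from `unknownProductCodes`, would be the signal to flag the document for review. A real address book would likely narrow candidates by postal code before scoring.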
Step 4 — Write or queue Documents with confidence ≥ 0.85 and all fields validated are written directly to the ERP integration table. Others go to the review queue.
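The routing rule in Step 4 is small enough to express as a single pure function. The sketch below uses hypothetical names (`ValidationOutcome`, `routeDocument`); only the 0.85 threshold and the two destinations come from the description above.

```typescript
// Hypothetical names; the threshold and the two routes are taken from Step 4.
type ValidationOutcome = { productCodesValid: boolean; addressMatched: boolean };
type Route = "auto_write" | "review_queue";

const AUTO_WRITE_THRESHOLD = 0.85;

function routeDocument(confidence: number, validation: ValidationOutcome): Route {
  const allValid = validation.productCodesValid && validation.addressMatched;
  // A missing or NaN confidence fails the comparison and falls through to review.
  return confidence >= AUTO_WRITE_THRESHOLD && allValid ? "auto_write" : "review_queue";
}
```

Defaulting to the review queue means any unexpected input (for example a malformed confidence value) ends up in front of a human rather than in the ERP.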
A Next.js dashboard shows the review queue with:
- Side-by-side view: original document image + extracted JSON
- Inline editing for any field
- "Accept" / "Reject" / "Re-extract" actions
- Audit trail of every edit
Reviewers process the queue in the morning for documents received overnight.
The validated records are written to a staging table in Supabase. A scheduled job (Supabase pg_cron) exports the staged records to the client's ERP via a REST adapter, running every 2 hours during business hours.
| Layer | Technology |
|---|---|
| Storage | Cloudflare R2 (document images) |
| Edge | Cloudflare Workers (upload handler, orchestration) |
| AI | GPT-4o Vision (OpenAI) |
| Database | Supabase (PostgreSQL) |
| Review dashboard | Next.js 14 (App Router) |
| ERP connector | ERP REST API (custom adapter) |
| Language | TypeScript |
GPT-4o Vision over traditional OCR — classic OCR (Tesseract, etc.) struggles with handwriting, poor scan quality, and mixed layouts. GPT-4o Vision handles all three and returns structured JSON directly, skipping the intermediate parse step.
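As a rough illustration of "structured JSON directly", the sketch below builds a request body for the OpenAI Chat Completions endpoint with an image input and JSON mode, then applies a light sanity check to the reply. The prompt wording and the function names (`buildExtractionRequest`, `parseExtraction`) are assumptions; the production prompt is not shown in this README, and the HTTP call itself is omitted.

```typescript
// Builds the JSON body for POST https://api.openai.com/v1/chat/completions.
// The prompt text is an illustrative assumption, not the production prompt.
function buildExtractionRequest(imageBase64: string, mimeType = "image/jpeg") {
  return {
    model: "gpt-4o",
    temperature: 0,
    // JSON mode: the reply is constrained to a single valid JSON object.
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "You extract data from logistics documents. Reply with JSON only, " +
          "matching the DeliveryNoteExtraction schema. Use null for unreadable fields.",
      },
      {
        role: "user",
        content: [
          { type: "text", text: "Extract the fields from this document as JSON." },
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
        ],
      },
    ],
  };
}

// JSON mode guarantees valid JSON, not a valid schema, so check the basics before use.
function parseExtraction(raw: string): Record<string, unknown> | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as Record<string, unknown>;
  const c = obj.confidence;
  const ok = Array.isArray(obj.line_items) && typeof c === "number" && c >= 0 && c <= 1;
  return ok ? obj : null;
}
```

A reply that fails `parseExtraction` would naturally be treated like a low-confidence extraction and sent to review.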
Confidence-based routing — not all extractions are equal. Documents with confidence ≥ 0.85 that also pass validation go to auto-write; everything else goes to human review. This keeps the review queue manageable (≈5% of documents) while automating the bulk.
Cloudflare R2 + Workers — documents are uploaded directly to R2 from mobile devices in the field. The Worker triggers immediately on upload, keeping latency low without a polling loop or message queue for the typical volume.
Staging table → ERP — writing directly to the ERP API on every extraction would be brittle. The staging table decouples extraction from ERP sync, making retries, rollbacks, and debugging straightforward. The ERP adapter is the only component that knows about the target ERP system.
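One way to picture why the staging table makes retries simple: each row carries its own export state, and a scheduled run only needs to pick up what is still pending. The row shape, status names, and retry limit below are hypothetical; the real staging schema is not shown in this README.

```typescript
// Hypothetical staging-row shape and retry policy, for illustration only.
type StagedStatus = "pending" | "exported" | "failed";
type StagedRecord = {
  id: string;
  status: StagedStatus;
  attempts: number;
  lastError: string | null;
};
type ExportResult = { ok: true } | { ok: false; error: string };

const MAX_ATTEMPTS = 5; // assumed limit before a row needs manual attention

// Pure transition: the row as it should be stored after one export attempt.
function afterExportAttempt(record: StagedRecord, result: ExportResult): StagedRecord {
  const attempts = record.attempts + 1;
  if (result.ok) {
    return { ...record, status: "exported", attempts, lastError: null };
  }
  const status: StagedStatus = attempts >= MAX_ATTEMPTS ? "failed" : "pending";
  return { ...record, status, attempts, lastError: result.error };
}

// Rows the next scheduled run should pick up.
function dueForExport(records: StagedRecord[]): StagedRecord[] {
  return records.filter((r) => r.status === "pending");
}
```

Because a failed attempt leaves the row in `pending`, the next scheduled run retries it without any extra queueing machinery, and the ERP adapter stays the only code that talks to the ERP.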
Supabase for everything except storage — product catalogue, address book, review queue, audit log, and ERP staging all live in Postgres. Using a single database keeps joins simple and avoids distributed transactions.
- ~95% of documents processed automatically without human review
- Data entry time reduced from ~4 hours/day to ~20 minutes (review queue only)
- Error rate in ERP records dropped from ~3% (manual) to ~0.3% (pipeline)
```
docflow/
├── README.md          ← this file (architecture documentation)
├── app/
│   └── index.html     ← the field-app demo: capture a document,
│                        extract structured data, send to dashboard.
│                        Single-file standalone; open in a browser.
└── dashboard/
    └── index.html     ← the operator review dashboard: queue of
                         pending docs, approve / reject / send-to-ERP
                         actions, SAP integration log, analytics.
```
Both demos are single-file HTML. No build step, no dependencies, no API keys needed.
```shell
# Field-app demo
open app/index.html

# Dashboard demo
open dashboard/index.html
```

The dashboard listens for new documents from the field-app via localStorage, so the two demos communicate when both are open in the same browser. Open both, click "📱 Simular doc. da app" ("Simulate doc. from app") in the dashboard or use the field-app to send a new document, and watch the flow.
All client-identifying data has been replaced with generic placeholders:
- Supplier/brand names → "Acme Bebidas"
- Real NIFs → fake NIFs in the 500 100 100 format
- Geographic markers → generic "Lisboa, Portugal"
- Real operator name → "João Silva"
- Cost-center codes → generic "ACME-LOG-001"
The architecture, the UX flow, the SAP integration shape, the field-to-dashboard handoff, the OCR-vs-LLM fallback strategy — all real, all preserved.
This repository is published as a portfolio showcase of work I did under a real client engagement. The code is not licensed for reuse, redistribution, or modification. It is provided here for review purposes only. If you'd like to discuss similar work for your own organisation, get in touch.
Built by Miguel Borges · hello@miguelborges.dev