I convert native or scanned PDFs into Excel, CSV, or JSON (using OCR for scans), preserve clean structure for tables and text, and optionally apply light remediation (reading order, headings, bookmarks/TOC, basic alt text on key images).
Work is priced per page with transparent milestones on Guru’s SafePay.
What I deliver
Structured outputs: XLSX/CSV/JSON (schema-aligned if provided).
Clean tables (rows/columns aligned), normalized numbers/dates.
OCR for scans, so the text is selectable and searchable.
Light remediation add-on: reading order, H1–H3 headings, bookmarks/TOC, basic alt text (simple docs).
QA pack: spot checks, totals cross-checks, and a small manifest with file hashes on request.
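As a rough illustration of the QA pack, here is a minimal sketch of a totals cross-check and a hash manifest. The function names (totals_match, sha256_of, build_manifest), the one-cent tolerance, and the choice of SHA-256 are assumptions made for this example; the actual checks depend on the document type and what you ask for.

```python
import hashlib
from decimal import Decimal
from pathlib import Path


def totals_match(line_items, stated_total, tolerance=Decimal("0.01")):
    """Return True when the extracted line items sum to the stated total."""
    computed = sum((Decimal(x) for x in line_items), Decimal("0"))
    return abs(computed - Decimal(stated_total)) <= tolerance


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large deliverables need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Map each file name to its size in bytes and its SHA-256 hash."""
    return {
        Path(p).name: {"bytes": Path(p).stat().st_size, "sha256": sha256_of(p)}
        for p in paths
    }
```

A manifest like this lets you confirm that the files you received are byte-for-byte the files I checked.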
How I work
You share sample pages + target format (or a schema).
I send a per-page quote and set milestones (e.g., 20 pages each).
I process with OCR + parsing + validation; we review the first milestone.
I deliver final files and QA notes; you approve the milestone and we continue.
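To show how a per-page quote splits into milestones, here is a small sketch. The function name plan_milestones and the rate used below are placeholders for illustration only, not my actual pricing; the 20-page milestone size follows the example above.

```python
from decimal import Decimal


def plan_milestones(total_pages, rate_per_page, pages_per_milestone=20):
    """Split a job into page ranges and price each range per page."""
    if total_pages <= 0 or pages_per_milestone <= 0:
        raise ValueError("page counts must be positive")
    rate = Decimal(str(rate_per_page))  # hypothetical rate, passed in by the caller
    milestones = []
    for start in range(1, total_pages + 1, pages_per_milestone):
        end = min(start + pages_per_milestone - 1, total_pages)
        pages = end - start + 1
        milestones.append({"pages": (start, end), "amount": pages * rate})
    return milestones
```

For example, plan_milestones(50, "1.50") yields three milestones covering pages 1 to 20, 21 to 40, and 41 to 50, with amounts of 30.00, 30.00, and 15.00.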
Use cases
Statements, invoices, forms, research papers, reports.
PDF tables to Excel; PDF text to clean Markdown/Word.
Tools
OCR & parsing (e.g., Tesseract/ABBYY), table extraction (e.g., pdfplumber/Tabula), data checks (Python/pandas).
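The data-check step can include normalizing numbers and dates before export. The sketch below uses only the Python standard library to stay self-contained (on a real job this would more likely run over pandas columns). The function names, the US-style thousands separator, the dollar sign, and the list of accepted date formats are all assumptions for this example.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed input formats; a real job would use the formats found in the source PDFs.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_amount(text):
    """Turn '$1,234.50' or '(1,234.50)' into a Decimal; None if unparseable."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    # Accounting-style parentheses mark a negative amount.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -value if negative else value


def normalize_date(text):
    """Return an ISO 8601 date string, or None when no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Values that come back as None are flagged for a manual spot check rather than guessed.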
Light vs. full remediation
Included on request: light remediation for simple docs (reading order, headings, bookmarks, basic alt text).
Not included (quoted separately): full PDF accessibility (PDF/UA/WCAG), advanced forms, complex tables (multi-level headers, merged cells), heavy graphics, math, handwriting.
What I need from you