This. Any complex relationship between parent cells and spanned cells in a table still has low accuracy. Tr...

bobsmooth · 2025-10-20T08:59:00 1760950740

Maybe I misunderstood the assignment but it seems to work for me.

https://chatgpt.com/share/68f5f9ba-d448-8005-86d2-c3fbae028b...

Edit: Just caught a mistake: it transcribed one of the prices incorrectly.

kbumsik · 2025-10-20T09:13:37 1760951617

Right, I wouldn't hand full table detection to a VLM, because they tend to make mistakes with the numbers in tables...

pietz · 2025-10-20T09:40:12 1760953212

Maybe my imagination is limited or our documents aren't complex enough, but are we talking about realistic written documents? I'm sure you can take a screenshot of a very complex spreadsheet and it fails, but in that case you already have the data in structured form anyway, no?

kbumsik · 2025-10-20T10:39:02 1760956742

> realistic written documents?

Just get a DEF 14A (Annual meeting) filing of a company from SEC EDGAR.

I have seen so many mistakes when looking at the results closely.

Here is a DEF 14A filing from Salesforce. You can print it to a PDF and then try converting it.

https://www.sec.gov/Archives/edgar/data/1108524/000110852425...

grosswait · 2025-10-20T12:34:18 1760963658

Historical filings are still a problem, but hasn’t the SEC required filing in an XML format since the end of 2024?

richardlblair · 2025-10-20T13:28:58 1760966938

It's not really about SEC filings, though. While we folks on HN would never think of hard copies of invoices, much of the world still operates this way.

As mentioned above, I have about 200 construction invoices. They are all formatted in a way that doesn't make sense. Most fail both OCR and OpenAI.

KoolKat23 · 2025-10-20T16:13:42 1760976822

OpenAI has unusably low image DPI. Try Gemini.

daemonologist · 2025-10-20T14:17:08 1760969828

Now if someone mails or faxes you that spreadsheet? You're screwed.

Spreadsheets are not the biggest problem though, as they have a reliable 2-dimensional grid - at worst some cells will be combined. The form layouts and n-dimensional table structures you can find on medical and insurance documents are truly unhinged. I've seen documents that I struggled to interpret.
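To make the spreadsheet case concrete, here is a rough sketch of why combined cells are manageable on a regular grid: each merged cell can be expanded back into the positions it covers. The `(row, col, rowspan, colspan, text)` tuple layout, the `expand_spans` name, and the sample rows are all made up for illustration, not taken from any particular OCR tool or spreadsheet library.

```python
def expand_spans(cells, n_rows, n_cols):
    """Expand merged cells into a dense n_rows x n_cols grid.

    cells: iterable of (row, col, rowspan, colspan, text) tuples
    (a hypothetical layout chosen for this sketch).
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    for row, col, rowspan, colspan, text in cells:
        # Copy the cell's text into every grid position it covers.
        for r in range(row, row + rowspan):
            for c in range(col, col + colspan):
                if grid[r][c] is not None:
                    raise ValueError(f"cell ({r}, {c}) is covered twice")
                grid[r][c] = text
    return grid


# Illustrative data: a header spanning two columns and a label spanning two rows.
cells = [
    (0, 0, 1, 2, "Item"),
    (0, 2, 1, 1, "Price"),
    (1, 0, 2, 1, "Labor"),
    (1, 1, 1, 1, "Framing"),
    (1, 2, 1, 1, "1200"),
    (2, 1, 1, 1, "Drywall"),
    (2, 2, 1, 1, "800"),
]
grid = expand_spans(cells, n_rows=3, n_cols=3)
for line in grid:
    print(line)
```

This only works because every cell has a well-defined row and column. The form layouts on medical and insurance documents don't give you that starting point.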

KoolKat23 · 2025-10-20T16:03:42 1760976222

To be fair, this is problematic for humans too. My old insurer outright rejected things like that, stating they weren't legible.

(I imagine it also had the benefit of reducing fraud/errors).

In this day and age, it's probably easier/better to change the process around that as there's little excuse for such shit quality input. I understand this isn't always possible though.