OCR Showdown: Cardinal vs. the Rest

Jianna Liu

Jul 19, 2025

You asked, and we answered!

Everyone’s been asking: how does Cardinal actually stack up against the big names? So we put it to the test.

In this showdown, we pit Cardinal against the most-requested providers - from legacy OCR engines like Tesseract and Textract to modern LLMs like GPT-5, Claude, and Gemini. To keep it fair, we ran all of them against the same three notoriously tricky images (there's a sketch of the test harness right after this list):

  • Handwriting + Annotations → This sample packs all the tricks into one: handwritten notes, annotations, and tables. We're looking at how well each system handles the handwriting while still keeping the surrounding structure intact.

  • Complex Tables → This one is brutal. The document has multiple spanning cells that stretch across neighboring columns, which completely throws off most systems.

  • Checkmarks → This one is particularly tricky because the checkmarks are filled-in boxes, which makes them especially hard for traditional OCR systems to recognize. Azure, for example, trained its checkbox model to look for “closed” objects - but filled boxes don't fit that pattern at all, so the system misses them completely.
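To give a feel for the setup, here's a minimal sketch of what a harness like this can look like. It only covers the two providers with simple local SDKs (Tesseract via pytesseract, and Textract via boto3 with AWS credentials configured); the LLM and Cardinal calls are omitted for brevity, and the sample filenames are placeholders for our three test images:

```python
# Minimal benchmark-harness sketch: run the same images through multiple
# OCR providers and dump the raw text for side-by-side comparison.
# Assumes: pytesseract + Pillow + boto3 installed, AWS credentials set up.
from pathlib import Path

import boto3
import pytesseract
from PIL import Image

# Placeholder filenames for the three test images described above.
SAMPLES = ["handwriting_annotations.png", "complex_tables.png", "checkmarks.png"]

textract = boto3.client("textract")

def run_tesseract(path: str) -> str:
    # Plain text extraction; Tesseract has no notion of tables or checkboxes.
    return pytesseract.image_to_string(Image.open(path))

def run_textract(path: str) -> str:
    # Synchronous text detection; Textract returns PAGE/LINE/WORD blocks
    # with bounding boxes, so we keep just the LINE text here.
    response = textract.detect_document_text(
        Document={"Bytes": Path(path).read_bytes()}
    )
    return "\n".join(
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    )

for sample in SAMPLES:
    for name, runner in [("tesseract", run_tesseract), ("textract", run_textract)]:
        print(f"--- {name} on {sample} ---")
        print(runner(sample))
```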

Here's the Showdown

| Provider | Pros | Cons (compared to Cardinal) | Provider Results |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | Stronger on handwriting vs. other LLMs | ❌ No bounding boxes<br>❌ Bad with complex tables<br>❌ Hallucinates<br>❌ Breaks on long docs<br>❌ Expensive at scale | Handwriting ✅ (missing checkmarks)<br>Tables<br>Annotations |
| GPT-5 | Fairly good on standard docs, easy to use | ❌ Bad with tricky tables/handwriting<br>❌ Very high latency<br>❌ No bounding boxes<br>❌ Expensive | Handwriting<br>Tables<br>Annotations |
| Claude 4 Sonnet | Can return HTML output | ❌ Very expensive<br>❌ High hallucination rate<br>❌ No valid bounding boxes<br>❌ Bad with complex tables | Handwriting<br>Tables<br>Annotations |
| Microsoft Azure | More mechanical, less chance of hallucination | ❌ Does very poorly with handwriting<br>❌ Struggles with scanned docs | Handwriting ❌<br>Tables<br>Annotations |
| Textract | More mechanical, less chance of hallucination | ❌ Poor on irregular documents<br>❌ Fails on scanned + handwritten docs | Handwriting<br>Tables ❌<br>Annotations |
| Mistral OCR | Low cost | ❌ Failed on most test docs<br>❌ Poor on handwriting & complex tables<br>❌ Often “hangs” mid-doc | Handwriting<br>Tables<br>Annotations ❌ |
| Tesseract | Free | ❌ Failed on everything! | Handwriting<br>Tables<br>Annotations ❌ |

Conclusion

At the end of the day, this showdown makes one thing crystal clear: OCR is far from a solved problem. Legacy engines can’t handle today’s document complexity, and even the most advanced LLMs stumble on structure, hallucinate, or fail at scale.

That’s why Cardinal exists. We’re built to handle the messy reality of documents - whether it’s filled checkboxes, sprawling tables, or handwritten annotations - while keeping accuracy, structure, and cost efficiency intact.

If you’ve been wondering whether OCR can really keep up with your hardest documents, the results here speak for themselves.

👉 Check out the scripts and try it yourself - we’d love to see what you throw at Cardinal.