OCR Showdown: Cardinal vs. the Rest

Jianna Liu

Jul 19, 2025

You asked, and we answered!

Everyone’s been asking: how does Cardinal actually stack up against the big names? So we put it to the test.

In this showdown, we pit Cardinal against the most-requested providers - from legacy OCR engines like Tesseract and Textract to modern LLMs like GPT-5, Claude, and Gemini. To keep it fair, we ran all of them against the same three notoriously tricky images (there's a sketch of the test harness right after this list):

  • Handwriting + Annotations → This sample packs all the tricks into one: handwritten notes, annotations, and tables. We're looking at how well each system handles the handwriting while still keeping the surrounding structure intact.

  • Complex Tables → This one is brutal. The document has multiple spanning cells that stretch across neighboring columns, which completely throws off most systems.

  • Checkmarks → This one is particularly tricky because the checkmarks are filled-in boxes, which makes them especially hard for traditional OCR systems to recognize. Azure, for example, trained its checkbox model to look for “closed” objects - but filled boxes don't fit that pattern at all, so the system misses them completely.
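To give a feel for the setup, here's a minimal sketch of what a harness like this can look like. It only covers the two providers with simple local SDKs (Tesseract via pytesseract, and Textract via boto3 with AWS credentials configured); the LLM and Cardinal calls are omitted for brevity, and the sample filenames are placeholders for our three test images:

```python
# Minimal benchmark-harness sketch: run the same images through multiple
# OCR providers and dump the raw text for side-by-side comparison.
# Assumes: pytesseract + Pillow + boto3 installed, AWS credentials set up.
from pathlib import Path

import boto3
import pytesseract
from PIL import Image

# Placeholder filenames for the three test images described above.
SAMPLES = ["handwriting_annotations.png", "complex_tables.png", "checkmarks.png"]

textract = boto3.client("textract")

def run_tesseract(path: str) -> str:
    # Plain text extraction; Tesseract has no notion of tables or checkboxes.
    return pytesseract.image_to_string(Image.open(path))

def run_textract(path: str) -> str:
    # Synchronous text detection; Textract returns PAGE/LINE/WORD blocks
    # with bounding boxes, so we keep just the LINE text here.
    response = textract.detect_document_text(
        Document={"Bytes": Path(path).read_bytes()}
    )
    return "\n".join(
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    )

for sample in SAMPLES:
    for name, runner in [("tesseract", run_tesseract), ("textract", run_textract)]:
        print(f"--- {name} on {sample} ---")
        print(runner(sample))
```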

Here's the Showdown

| Provider | Pros | Cons (compared to Cardinal) | Provider Results |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | Stronger on handwriting vs. other LLMs | ❌ No bounding boxes<br>❌ Bad with complex tables<br>❌ Hallucinates<br>❌ Breaks on long docs<br>❌ Expensive at scale | Handwriting ✅ (missing checkmarks)<br>Tables<br>Annotations |
| GPT-5 | Fairly good on standard docs, easy to use | ❌ Bad with tricky tables/handwriting<br>❌ Very high latency<br>❌ No bounding boxes<br>❌ Expensive | Handwriting<br>Tables<br>Annotations |
| Claude 4 Sonnet | Can return HTML output | ❌ Very expensive<br>❌ High hallucination rate<br>❌ No valid bounding boxes<br>❌ Bad with complex tables | Handwriting<br>Tables<br>Annotations |
| Microsoft Azure | More mechanical, less chance of hallucination | ❌ Does very poorly with handwriting<br>❌ Struggles with scanned docs | Handwriting ❌<br>Tables<br>Annotations |
| Textract | More mechanical, less chance of hallucination | ❌ Poor on irregular documents<br>❌ Fails on scanned + handwritten docs | Handwriting<br>Tables ❌<br>Annotations |
| Mistral OCR | Low cost | ❌ Failed on most test docs<br>❌ Poor on handwriting & complex tables<br>❌ Often “hangs” mid-doc | Handwriting<br>Tables<br>Annotations ❌ |
| Tesseract | Free | ❌ Failed on everything! | Handwriting<br>Tables<br>Annotations ❌ |

Conclusion

At the end of the day, this showdown makes one thing crystal clear: OCR is far from a solved problem. Legacy engines can’t handle today’s document complexity, and even the most advanced LLMs stumble on structure, hallucinate, or fail at scale.

That’s why Cardinal exists. We’re built to handle the messy reality of documents - whether it’s filled checkboxes, sprawling tables, or handwritten annotations - while keeping accuracy, structure, and cost efficiency intact.

If you’ve been wondering whether OCR can really keep up with your hardest documents, the results here speak for themselves.

👉 Check out the scripts and try it yourself - we’d love to see what you throw at Cardinal.