Deep Dive into Semantic Formatting Score: A New Metric for Meaningful Document Formatting
LlamaIndex · 2026-04-28 03:05
A price tag shows $10 crossed out and replaced with $4. Your parser outputs both prices as plain text, side by side. Now the agent thinks there are two valid prices. Which one should it use? I'm Simon from LlamaIndex. Most document OCR benchmarks completely ignore text formatting. They strip it before evaluation, treating it as cosmetic. But in ParseBench, the first document OCR benchmark for AI agents, we introduced a semantic formatting score, because formatting carries meaning. Strikethrough marks de ...