Introducing GPT-5.5 with Databricks

GPT-5.5 in the agent harness setting has a 46% reduction in errors compared to GPT-5.4, and it is the only model in the agent harness setting that is getting above 50% on the benchmark. OfficeQA serves as a proxy for what customer workflows will be at Databricks. Customers will often come to us with really messy-looking documents. We rely on custom parsing at Databricks and on multi-agent setups that can perform parsing within their agent harnesses. Codex with GPT-5.5 is now currently state-of-the-a ...