The Wild World of AI: 6 Months That Changed Everything
AI Engineer · 2025-07-10 03:23

There are all of these benchmarks full of numbers. I don't like the numbers. There are the leaderboards. I'm kind of beginning to lose trust in the leaderboards as well. So for my own work, I've been leaning increasingly into my own little benchmark, which started as a joke and has actually turned into something that I rely on quite a lot. And that's this. I prompt models with "Generate an SVG of a pelican riding a bicycle." I have good reasons for this. Firstly, these are not image models. These are text ...
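As a rough illustration of the benchmark described above, here is a minimal sketch of the step that comes after prompting: pulling the SVG out of a model's text reply and checking that it is well-formed XML. The helper name `extract_svg` and the sample reply are hypothetical and not from the talk; actually calling a model is left out, since the talk does not say which tooling is used.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# The prompt quoted in the talk.
PROMPT = "Generate an SVG of a pelican riding a bicycle"

# Assumption: replies may wrap the SVG in surrounding prose, so we search
# for the first <svg>...</svg> element rather than expecting a bare document.
SVG_PATTERN = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(reply: str) -> Optional[str]:
    """Hypothetical helper: return the first well-formed SVG in a reply, or None."""
    match = SVG_PATTERN.search(reply)
    if match is None:
        return None
    svg = match.group(0)
    try:
        # Only checks that the markup parses as XML, not that it looks like a pelican.
        ET.fromstring(svg)
    except ET.ParseError:
        return None
    return svg


# Made-up reply text, standing in for what a text model might return.
sample_reply = (
    "Sure! Here is your drawing: "
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<circle cx="30" cy="70" r="10"/></svg> Hope that helps.'
)

if __name__ == "__main__":
    print(PROMPT)
    print(extract_svg(sample_reply))
```

The well-formedness check only filters out replies that cannot be rendered at all. Judging whether the result resembles a pelican on a bicycle still means opening the file and looking at it.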