X @Avi Chawla
Avi Chawla·2025-10-22 06:31
If you found it insightful, reshare it with your network.
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.

Avi Chawla (@_avichawla):
Most LLM-powered evals are BROKEN!
These evals can easily mislead you to believe that one model is better than the other, primarily due to the way they are set up.
G-Eval is one popular example.
Here's the core problem with LLM eval techniques and a better alternative to them: https://t.co/izhjUEEipI ...