X @Avi Chawla
RT Avi Chawla (@_avichawla): A simple technique trains neural nets 4-6x faster!
- OpenAI used it in GPT models.
- Meta used it in LLaMA models.
- Google used it in Gemini models.
Here's a breakdown (with code): ...