Co-distillation
X @Avi Chawla
Avi Chawla· 2026-03-05 20:00
RT Avi Chawla (@_avichawla)
You're in a Research Scientist interview at DeepMind. The interviewer asks: "Our investors want us to contribute to open-source. Gemini crushed benchmarks. But we'll lose competitive edge by open-sourcing it. What to do?" You: "Release a research paper." Here's what you missed: LLMs today don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.
Distillation helps us ...
X @Avi Chawla
Avi Chawla· 2026-03-05 06:31
You're in a Research Scientist interview at DeepMind. The interviewer asks: "Our investors want us to contribute to open-source. Gemini crushed benchmarks. But we'll lose competitive edge by open-sourcing it. What to do?" You: "Release a research paper." Here's what you missed: LLMs today don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.
Distillation helps us do so, and the visual expla ...
X @Avi Chawla
Avi Chawla· 2025-09-29 19:20
RT Avi Chawla (@_avichawla)
You're in a Research Scientist interview at OpenAI. The interviewer asks: "Our investors want us to contribute to open-source. o3 crushed benchmarks. But we can lose a competitive edge by open-sourcing it. What do we do?" You: "Release the research paper." Interview over. You forgot that LLMs don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.
Distillation helps ...
X @Avi Chawla
Avi Chawla· 2025-09-29 06:33
You're in a Research Scientist interview at OpenAI. The interviewer asks: "Our investors want us to contribute to open-source. o3 crushed benchmarks. But we can lose a competitive edge by open-sourcing it. What do we do?" You: "Release the research paper." Interview over. You forgot that LLMs don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.
Distillation helps us do so, and the visual e ...
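
The posts above say that models can learn from each other through distillation, but the previews cut off before explaining how. As a rough illustration only, here is a minimal sketch of the classic soft-label form, in which a student is penalized for diverging from a teacher's temperature-softened output distribution. The posts do not say which distillation variant the named models used, so the loss form, the function names, the temperature value, and the example logits are all assumptions made for this sketch.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature (assumed > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the conventional rescaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical next-token logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that ranks tokens the way the teacher does gets the smaller loss, which is the signal a training loop would minimize. In practice this term is often mixed with the ordinary cross-entropy loss on ground-truth labels.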