Quantization
X @Avi Chawla
Avi Chawla· 2025-08-14 06:33
voyage-context-3 supports 2048, 1024, 512, and 256 dimensions with quantization.
Compared to OpenAI-v3-large (float, 3072d), voyage-context-3 (int8, 2048):
- delivers 83% lower vector DB costs
- provides 8.60% better retrieval quality
Check this 👇 https://t.co/OqBhucXCN5 ...
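The 83% figure is consistent with simple per-vector storage arithmetic. The sketch below is a back-of-the-envelope check, not the post's own calculation: it assumes "float" means 32-bit floats (4 bytes per value), that int8 uses 1 byte per value, and that vector DB cost scales linearly with raw vector bytes. The helper name `bytes_per_vector` is made up for illustration.

```python
def bytes_per_vector(dims: int, bytes_per_value: int) -> int:
    """Raw storage for one embedding, ignoring index and metadata overhead."""
    return dims * bytes_per_value


# Assumption: "float" in the post means float32, i.e. 4 bytes per dimension.
baseline = bytes_per_vector(3072, 4)  # OpenAI-v3-large (float, 3072d)

# int8 quantization stores 1 byte per dimension.
compact = bytes_per_vector(2048, 1)   # voyage-context-3 (int8, 2048)

# Fractional reduction in raw bytes per vector.
saving = 1 - compact / baseline

print(f"baseline: {baseline} bytes, compact: {compact} bytes")
print(f"storage reduction: {saving:.1%}")
```

Under these assumptions the reduction is 5/6 of the baseline, roughly 83%. Real bills also depend on index structures, replication, and pricing tiers, so treat this as a sanity check only.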
360Brew: LLM-based Personalized Ranking and Recommendation - Hamed and Maziar, LinkedIn AI
AI Engineer· 2025-07-16 17:59
[Music] Hi everyone. Very excited to be here. I'm Hamed, and this is Maziar. Today we're going to talk about our journey in leveraging large language models for personalization and ranking, and our path to productionizing such a large model for LinkedIn use cases. Recommendation, ranking, and personalization are deeply integrated into our daily lives: when you go to a feed to read an article, when you're looking for a job, when you're searching for something, when you're buying someth ...
X @Avi Chawla
Avi Chawla· 2025-06-11 06:30
A great tool to estimate how much VRAM your LLMs actually need.
Alter the hardware config, quantization, etc., and get to know about:
- Generation speed (tokens/sec)
- Precise memory allocation
- System throughput, etc.
No more VRAM guessing! https://t.co/lZbIink12f ...
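To see why quantization matters for VRAM, here is a rough sketch of the weights-only part of the estimate. It is not the linked tool's method, which the post does not describe. It assumes memory for weights is simply parameter count times bits per weight, and it ignores the KV cache, activations, and framework overhead. The function name `weight_memory_gib` and the 7-billion-parameter model size are hypothetical, chosen only for illustration.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GiB.

    Ignores KV cache, activations, and runtime overhead, so the real
    VRAM requirement will be higher than this figure.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical 7-billion-parameter model at three common precisions.
for bits in (16, 8, 4):
    gib = weight_memory_gib(7e9, bits)
    print(f"{bits:>2}-bit weights: ~{gib:.2f} GiB")
```

Halving the bits per weight halves this weights-only figure, which is why the quantization setting changes the estimate so sharply.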