Workflow
Parameter-efficient finetuning (PEFT)
Avi Chawla· 2025-12-04 19:38
LLM Fine-tuning Techniques

- Traditional fine-tuning is impractical for LLMs due to the large number of parameters (billions) and dataset sizes (hundreds of GBs), which led to the development of parameter-efficient fine-tuning (PEFT) [1]
- PEFT techniques involve finding a lower-rank adaptation of the LLM weight matrices [2]

Specific PEFT Techniques

- **LoRA (Low-Rank Adaptation):** Adds two low-rank trainable matrices (A and B) alongside the frozen weight matrices; updates are learned in these low-rank matrices instead of fine-tuning the original weights, significantly reducing memory usage [3]
- **LoRA-FA (Frozen-A):** Freezes matrix A in LoRA and updates only matrix B, further reducing activation memory requirements [4]
- **VeRA:** Freezes matrices A and B, shares them across all layers, and learns layer-specific scaling vectors instead [4]
- **Delta-LoRA:** Also tunes the original weight matrix W, by adding to it the difference (delta) between the product of matrices A and B at two consecutive training steps [4][5]
- **LoRA+:** Sets a higher learning rate for matrix B than for matrix A in LoRA, resulting in better convergence [6]
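The core LoRA idea above can be sketched in a few lines of PyTorch. This is a minimal illustrative implementation, not code from the original post: the class name, rank `r=8`, and scaling `alpha/r` are my own assumptions, chosen to match the common LoRA formulation `W x + (alpha/r) * B A x`.

```python
# Minimal LoRA sketch (illustrative; hyperparameters are assumptions).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # original weights W stay frozen
        # A projects down to rank r, B projects back up; B starts at zero so
        # the adapted layer initially behaves exactly like the base layer.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
# Only A (8x768) and B (768x8) train: 12,288 params vs 590,592 in the base layer.
```

Freezing A (LoRA-FA) would just mean adding `self.A.requires_grad = False` after initialization, leaving only B to train.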
Avi Chawla· 2025-12-04 06:30
I have been fine-tuning LLMs for over 2 years now! Here are the top 5 LLM fine-tuning techniques, explained with visuals.

First of all, what's so different about LLM finetuning? Traditional fine-tuning is impractical for LLMs (billions of params; 100s of GBs of data). Since this kind of compute isn't accessible to everyone, parameter-efficient finetuning (PEFT) came into existence.

Before we go into the details of each technique, here's some background that will help you better understand them: LLM weights are matric ...
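Of the five techniques summarized earlier, LoRA+ is the simplest to express in code: it only changes the optimizer setup. The sketch below is a hypothetical illustration (the 16x learning-rate ratio and shapes are my own assumptions, not values from the post), showing matrix B trained with a higher learning rate than matrix A via PyTorch parameter groups.

```python
# Hedged sketch of the LoRA+ idea: a higher learning rate for B than for A.
# The shapes and the 16x ratio are illustrative assumptions.
import torch

A = torch.nn.Parameter(torch.randn(8, 768) * 0.01)  # low-rank down-projection
B = torch.nn.Parameter(torch.zeros(768, 8))          # low-rank up-projection

base_lr = 1e-4
optimizer = torch.optim.AdamW([
    {"params": [A], "lr": base_lr},
    {"params": [B], "lr": base_lr * 16},  # B steps faster than A
])
```

Everything else about the training loop stays standard LoRA; only the per-group learning rates differ.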