Avi Chawla·2026-02-08 09:15
1. Use momentum

In gradient descent, every parameter update depends solely on the current gradient. This leads to unwanted oscillations during optimization. Momentum reduces this by adding a weighted average of previous gradient updates to the update rule.

Check this 👇 https://t.co/77X9rwRyOF
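The idea above can be sketched in a few lines of NumPy. This is a minimal sketch of the classical momentum update (velocity accumulates a decaying average of past gradients); the function name, learning rate, and `beta` value are illustrative choices, not from the original post, and some libraries use slightly different but equivalent formulations.

```python
import numpy as np

def sgd_momentum_step(params, grad, velocity, lr=0.05, beta=0.9):
    """One SGD-with-momentum update.

    v <- beta * v + grad   (decaying average of past gradients)
    p <- p - lr * v        (step along the smoothed direction)
    """
    velocity = beta * velocity + grad
    params = params - lr * velocity
    return params, velocity

# Toy usage: minimize f(x) = x^2 starting from x = 5.0
x = np.array([5.0])
v = np.zeros_like(x)
for _ in range(100):
    grad = 2 * x  # gradient of x^2
    x, v = sgd_momentum_step(x, grad, v)
print(x)
```

Because the velocity averages recent gradients, updates in a consistently downhill direction reinforce each other while oscillating components partially cancel, which is exactly the damping effect described above.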