- Gradient descent
- Gauss-Newton algorithm
- Conjugate gradient method
Gradient descent is an iterative algorithm to find a local optimum (minimum) of a differentiable multivariate function.
Let $f: \mathbb{R}^n \to \mathbb{R}$ be a differentiable function. Gradient descent produces the iterates

$$x_{k+1} = x_k - \tau \nabla f(x_k)$$

with a start value $x_0 \in \mathbb{R}^n$.

The step size $\tau$ can be chosen by backtracking line search: start with $\tau = 1$ and shrink it by a factor $\beta \in (0, 1)$ until the Armijo condition

$$f\bigl(x_k - \tau \nabla f(x_k)\bigr) - f(x_k) \le \delta \, \tau \, \nabla f(x_k)^\top \bigl(-\nabla f(x_k)\bigr)$$

holds. In code:
```python
beta = 0.9    # shrink factor for the step size
delta = 1e-4  # sufficient-decrease parameter of the Armijo condition

# Backtracking: shrink tau until the Armijo condition holds
while f(x_k - tau * f_grad(x_k)) - f(x_k) > delta * tau * f_grad(x_k).T @ (-f_grad(x_k)):
    tau = beta * tau
```
With this knowledge, we are ready to implement the full gradient descent algorithm (with line search).
```python
def gradient_descent(f, f_grad, x0, beta=0.9, delta=1e-4, max_iterations=1000):
    x_k = x0
    for _ in range(max_iterations):
        tau = 1.0
        # Backtracking line search (Armijo condition)
        while f(x_k - tau * f_grad(x_k)) - f(x_k) > delta * tau * f_grad(x_k).T @ (-f_grad(x_k)):
            tau = beta * tau
        x_k = x_k - tau * f_grad(x_k)  # gradient step with the accepted step size
    return x_k
```
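As a quick sanity check, the sketch below applies the routine to a simple quadratic whose minimum is known in closed form; the test function and start value are chosen here purely for illustration and are not part of the original text.

```python
import numpy as np

# Test problem: f(x) = x_1^2 + 2 x_2^2 has its unique minimum at the origin
f = lambda x: x[0]**2 + 2 * x[1]**2
f_grad = lambda x: np.array([2 * x[0], 4 * x[1]])

x_min = gradient_descent(f, f_grad, x0=np.array([3.0, -2.0]))
print(x_min)  # converges towards [0, 0]
```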
Newton's method is an iterative algorithm (like gradient descent) to find a local optimum. It incorporates not only the first derivative but also the second derivative in its calculations.
Let $f: \mathbb{R}^n \to \mathbb{R}$ be twice differentiable. Newton's method produces the iterates

$$x_{k+1} = x_k - H_f(x_k)^{-1} \nabla f(x_k)$$

with a start value $x_0 \in \mathbb{R}^n$, where $H_f$ denotes the Hessian matrix of $f$.
Note: We use the inverse of the Hessian matrix; in practice one solves the linear system $H_f(x_k)\,d = \nabla f(x_k)$ instead of forming the inverse explicitly. We can use the Armijo and Wolfe conditions to make the algorithm more efficient.
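Since no code is given for Newton's method above, here is a minimal sketch in the same style as the gradient descent loop; the callables `f_grad` and `f_hess` (returning the gradient and the Hessian) and the stopping tolerance are assumptions made for illustration.

```python
import numpy as np

def newtons_method(f_grad, f_hess, x0, max_iterations=100, tol=1e-8):
    x_k = x0
    for _ in range(max_iterations):
        grad = f_grad(x_k)
        if np.linalg.norm(grad) < tol:  # stop once the gradient (nearly) vanishes
            break
        # Newton step: solve H_f(x_k) d = grad instead of inverting the Hessian
        d = np.linalg.solve(f_hess(x_k), grad)
        x_k = x_k - d
    return x_k
```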
The Gauss–Newton algorithm is used to solve non-linear least squares problems, which is equivalent to minimizing a sum of squared function values.
For a model with parameters $m$ and $b$, one Gauss–Newton iteration updates the parameter vector from the model's Jacobian and the current residuals; the routine is wrapped in a function `gauss_newton` here, and possible definitions of the helpers `jacobian` and `f_loss` are sketched after the snippet:

```python
import numpy as np

def gauss_newton(x, y, m0, b0, max_iter=100, tol=1e-6):
    m_b = np.array([m0, b0], dtype=float)  # parameter vector (m, b)
    for _ in range(max_iter):
        jac = jacobian(x, m_b[0], m_b[1])    # Jacobian of the model w.r.t. (m, b)
        loss = f_loss(x, m_b[0], m_b[1], y)  # residuals y - model(x)
        # Gauss-Newton step: solve the normal equations (J^T J) d = J^T r
        new_m_b = m_b + np.linalg.solve(jac.T @ jac, jac.T @ loss)
        if np.linalg.norm(m_b - new_m_b) < tol:
            break
        m_b = new_m_b  # advance the iterate before the next step
    return new_m_b
```
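To run the snippet end to end, one possible choice of the helpers is a straight-line model $y = m\,x + b$ fitted to synthetic data; these concrete definitions and the test data are assumptions, since the original only references `jacobian` and `f_loss` by name.

```python
import numpy as np

def jacobian(x, m, b):
    # Jacobian of the model m*x + b w.r.t. (m, b): one row [x_i, 1] per data point
    return np.column_stack([x, np.ones_like(x)])

def f_loss(x, m, b, y):
    # Residuals between the observed data and the current model
    return y - (m * x + b)

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0 + rng.normal(scale=0.1, size=x.size)
print(gauss_newton(x, y, m0=0.0, b0=0.0))  # approximately [2.5, 1.0]
```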