Gradient descent algorithm


I have started implementing ML algorithms from scratch and comparing them with the scikit-learn implementations.

The coefficients w and b are adjusted step by step until they converge, and gradient descent is the algorithm that computes those steps.
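In symbols, each iteration of the function below applies the following update. Here J(w, b) stands for the cost (the name J is an assumption, inferred from the variable names dj_dw and dj_db) and α is the learning rate alpha. Both partial derivatives are evaluated at the current w and b before either parameter changes.

```latex
\begin{aligned}
w &\leftarrow w - \alpha \, \frac{\partial J(w, b)}{\partial w} \\
b &\leftarrow b - \alpha \, \frac{\partial J(w, b)}{\partial b}
\end{aligned}
```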

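def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function):
    # Start from the initial parameter values.
    b = b_in
    w = w_in

    for i in range(num_iters):
        # Compute both gradients from the current w and b before changing either.
        dj_dw, dj_db = gradient_function(x, y, w, b)
        # Step each parameter against its gradient, scaled by the learning rate.
        b = b - alpha * dj_db
        w = w - alpha * dj_dw
    # Note: cost_function is accepted but never called in this version.
    return w, b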

x: x_train array

y: y_train array

w_in: initial w

b_in: initial b

alpha: Learning rate

num_iters: number of iterations to run gradient descent

cost_function: function that measures how far the model's predictions are from the data points (passed in, but not called inside gradient_descent)

gradient_function: function to produce gradient dj_dw and dj_db
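The post does not show the two functions that get passed in, so here is a minimal sketch of what they might look like. It assumes a one-feature linear model f(x) = w * x + b and a mean squared error cost with a 1/(2m) factor; both are assumptions, not something the post states. The two-point data set is also hypothetical, since the real x_train and y_train are not shown.

```python
# Sketch only: assumes a one-feature linear model f(x) = w * x + b
# and a mean squared error cost with a 1/(2m) factor.

def compute_cost(x, y, w, b):
    """Return the assumed cost J(w, b) = 1/(2m) * sum((w*x_i + b - y_i)^2)."""
    m = len(x)
    total = 0.0
    for x_i, y_i in zip(x, y):
        total += (w * x_i + b - y_i) ** 2
    return total / (2 * m)


def compute_gradient(x, y, w, b):
    """Return (dj_dw, dj_db), the partial derivatives of the assumed cost."""
    m = len(x)
    dj_dw = 0.0
    dj_db = 0.0
    for x_i, y_i in zip(x, y):
        error = w * x_i + b - y_i  # prediction minus target
        dj_dw += error * x_i
        dj_db += error
    return dj_dw / m, dj_db / m


# Hypothetical data (not from the post): two points on the line y = 200x + 100.
x_demo = [1.0, 2.0]
y_demo = [300.0, 500.0]

print(compute_cost(x_demo, y_demo, 0, 0))          # 85000.0
print(compute_gradient(x_demo, y_demo, 0, 0))      # (-650.0, -400.0)
print(compute_gradient(x_demo, y_demo, 200, 100))  # (0.0, 0.0) at the exact fit
```

Both functions only use len() and iteration, so they accept plain lists as well as NumPy arrays. The gradient is zero at w = 200, b = 100 for this data, which is the point gradient descent should move toward.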

Here is the test code.

w_init = 0
b_init = 0
iterations = 10000
tmp_alpha = 1.0e-2
w_final, b_final = gradient_descent(x_train, y_train, w_init, b_init, tmp_alpha, iterations, compute_cost, compute_gradient)
print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})")

This is the output.

(w,b) found by gradient descent: (200.0000, 99.9999)

The parameters settle near w = 200 and b = 100, so the code appears to be working properly.
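One way to check a result like this independently is to compare it with the closed-form least-squares solution for a straight line. The sketch below assumes the model is a one-feature line fitted by squared error, which the post does not state explicitly. The function name closed_form_fit and the demo data are hypothetical.

```python
def closed_form_fit(x, y):
    """Least-squares slope and intercept for a single feature."""
    m = len(x)
    x_mean = sum(x) / m
    y_mean = sum(y) / m
    # Slope = sum of co-deviations divided by sum of squared x-deviations.
    s_xy = sum((x_i - x_mean) * (y_i - y_mean) for x_i, y_i in zip(x, y))
    s_xx = sum((x_i - x_mean) ** 2 for x_i in x)
    w = s_xy / s_xx
    b = y_mean - w * x_mean
    return w, b


# Hypothetical data (not from the post): two points on the line y = 200x + 100.
x_demo = [1.0, 2.0]
y_demo = [300.0, 500.0]

w_ref, b_ref = closed_form_fit(x_demo, y_demo)
print(f"(w,b) from closed form: ({w_ref:8.4f},{b_ref:8.4f})")
# (w,b) from closed form: (200.0000,100.0000)
```

Running closed_form_fit on the real x_train and y_train and comparing the result with w_final and b_final would show how close gradient descent got after the chosen number of iterations.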