Chapter 20 Machine learning techniques

I assume you have loaded the following modules:

import numpy as np
import matplotlib.pyplot as plt

20.1 Gradient descent

20.1.1 Vector functions

We call functions that either accept a vector argument, or return a vector result (or both), vector functions. Vector functions can be defined and used in almost the same way as “ordinary” scalar functions. For instance, here is a function that adds two vectors:

def add(v1, v2):
    v = v1 + v2
    return v

This is a vector function in the sense that both its arguments and its value are vectors (as written above, the arguments do not even have to be vectors: plain numbers and even strings will do). You can use this function as

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])
add(x, y)
## array([11, 22, 33])

While this is a valid function, it is not particularly useful, as the “+” operator alone achieves exactly the same. Let’s compute the Euclidean norm of a vector, \(||\boldsymbol{x}|| = \sqrt{\boldsymbol{x}' \cdot \boldsymbol{x}}\), using the \(\boldsymbol{x}\) from above:

def norm(x):
    n = np.sqrt(x.T @ x)
    return n

norm(x)
## 3.7416573867739413

This function uses dedicated vector operations (.T and @) and returns a single number. As both of these operations are dimension-independent, the function will work with vectors of any dimension. It returns a scalar, so it is a vector-argument, scalar-valued function.
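As a quick check of the dimension-independence claim, the same function handles a 4-D vector:

norm(np.array([1, 2, 3, 4]))
## 5.477225575051661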

The gradient of the squared norm \(f(\boldsymbol{x}) = \boldsymbol{x}' \cdot \boldsymbol{x}\) (not of the norm itself) is \(\nabla f(\boldsymbol{x}) = 2 \boldsymbol{x}\). This is a vector-argument, vector-valued function. You can implement it in Python as:

def grad(x):
    return 2*x

grad(x)
## array([2, 4, 6])
grad(y)
## array([20, 40, 60])

So vector functions are the same as normal functions when you write the code; only the arguments and return values are vectors.

Exercise 20.1

Consider a function \(f(\boldsymbol{x}) = e^{-\boldsymbol{x}' \cdot \boldsymbol{x}}\) where \(\boldsymbol{x}' = (x_{1}, x_{2})\).
  1. Define the function in Python. Its argument should be a single vector \(\boldsymbol{x}\), and it should do the calculations in vector form.
  2. Compute the function value at \(\boldsymbol{x}_1 = (0, 1)'\), \(\boldsymbol{x}_2 = (1, 0)'\) and \(\boldsymbol{x}_3 = (1, 1)'\). Use column vectors as input!
  3. Compute (analytically, on paper) the gradient of the function. You may want to do it in non-vector form, writing the function as \(f(x_{1}, x_{2}) = e^{-(x_{1}^{2} + x_{2}^{2})}\). Compute the gradient, and transform it back into vector form!
  4. Define the gradient function in Python. It should have a single argument, the vector \(\boldsymbol{x}\), and it should return a vector, the gradient value.
  5. Use the function to compute the gradient value at \(\boldsymbol{x}_{1} = (0, 1)'\), \(\boldsymbol{x}_{2} = (1, 0)'\) and \(\boldsymbol{x}_{3} = (1, 1)'\).
  6. Can this function compute the gradient value at \(\boldsymbol{x} = (0, 1, 2, 3)'\)? Explain, and try!

The solution
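For reference, here is a minimal sketch of parts 1 and 4, assuming column vectors as inputs; the names fexp and gradexp are our own, and the gradient is \(\nabla f(\boldsymbol{x}) = -2\, e^{-\boldsymbol{x}' \cdot \boldsymbol{x}}\, \boldsymbol{x}\):

def fexp(x):
    ## exp(-x'x); works for a column vector of any dimension
    return np.exp(-x.T @ x)

def gradexp(x):
    ## gradient -2*exp(-x'x)*x, a vector of the same shape as x
    return -2 * fexp(x) * x

x1 = np.array([[0], [1]])  # column vector
fexp(x1)
## array([[0.36787944]])
gradexp(x1)
## array([[-0.        ],
##        [-0.73575888]])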

When implementing a quadratic form, e.g. \(\boldsymbol{x}' \mathsf{A} \boldsymbol{x}\), you can either supply \(\mathsf{A}\) as a regular argument, as an argument with a default value, or define it as a global variable. So all these approaches will do:

## A as regular argument
def qf(x, A):
    q = x.T @ A @ x
    return q

A = np.eye(2)
x = np.array([1, 2])
qf(x, A)
## 5.0
## Regular argument with a default value
def qf1(x, A = np.eye(2)):
    q = x.T @ A @ x
    return q

qf1(x)
## 5.0
## Global variable
B = np.eye(2)
def qf2(x):
    q = x.T @ B @ x
    return q

qf2(x)
## 5.0

Which one to use depends on how you want to treat the “parameter” argument \(\mathsf{A}\). But you may want to add some extra code to ensure \(\mathsf{A}\) and \(\boldsymbol{x}\) are compatible.
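For instance, a minimal sketch of such a check (the name qf_checked and the exact error message are just an illustration):

def qf_checked(x, A):
    ## A must be square and conformable with x before computing x' A x
    if A.shape != (x.shape[0], x.shape[0]):
        raise ValueError(f"incompatible shapes: A is {A.shape}, x is {x.shape}")
    return x.T @ A @ x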

20.1.2 Visualizing the function and gradient

It is only meaningful to visualize functions of a 2-D vector argument, and gradients that return a 2-D vector value. In higher dimensions the visualization does not really work, and in the 1-D case it is rather simplistic. Let’s do it with the simple quadratic form \(\boldsymbol{x}' \boldsymbol{x}\) (its gradient is \(2 \boldsymbol{x}\)). First, define the function and its gradient:

def f(x):
    z = x.T @ x
    return z

def grad(x):
    g = 2 * x
    return g

Next, here is code that visualizes both the function and the gradient. It computes both on a grid, shows the function as a contour plot, and overlays the gradient as arrows:

def showFG(f,  # function: must work with a single 2-D vector argument
           grad = None,  # gradient.
           # Must accept the same argument as function
           # must return a 2-D vector
           levels = (1, 2, 3, 4),
           # contour levels to be drawn
           x1range = (-2, 2), x2range=(-2, 2),
           # range over which to make the plot
           nGrid = 26  # grid density: number of points along each axis
           ):
    ## Make the grid
    ex1 = np.linspace(x1range[0], x1range[1], nGrid)
    ex2 = np.linspace(x2range[0], x2range[1], nGrid)
    (xm1, xm2) = np.meshgrid(ex1, ex2)
    ## Initialize the function and gradient matrices
    z = np.zeros_like(xm1)  # function values
    g1 = np.zeros_like(xm1)  # first component of the gradient
    g2 = np.zeros_like(xm1)  # second component
    ## Compute the function/gradient on the grid in a loop
    for i in range(nGrid):
        for j in range(nGrid):
            x = np.array([[xm1[i,j]], [xm2[i,j]]])
            z[i,j] = f(x)[0,0]
            if grad is not None:
                g = grad(x)
                g1[i,j] = g[0,0]
                g2[i,j] = g[1,0]
    ## Plot the function using contours
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.set_aspect('equal')
    cs = ax.contour(ex1, ex2, z, levels = levels)
    ax.clabel(cs, inline=1, fontsize=10)
    ## Add arrows
    ax.quiver(ex1, ex2, g1, g2, width=0.002, headwidth=5)
    ## Print suggested levels, in case you don't know what
    ## are good values
    print("Suggested levels:")
    print(np.percentile(z, (20, 40, 60, 80)))

[Figure: the function \(\boldsymbol{x}' \boldsymbol{x}\) visualized as a contour plot, and its gradient \(2 \boldsymbol{x}\) as arrows.]

Finally, make the plot:

showFG(f, grad)

It shows a 2-D paraboloid with its minimum at \((0, 0)\): the contours are circles around the origin, and the gradient arrows point away from the minimum, in the direction of the steepest ascent.
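If the default contour levels do not fit your function, you can pass the suggested levels back in, together with a different plotting range; the values below are just an illustration:

showFG(f, grad, levels = (0.5, 2, 4, 8), x1range = (-3, 3), x2range = (-3, 3))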