Chapter 20 Machine learning techniques
20.1 Gradient descent
20.1.1 Vector functions
Exercise 20.1
Consider the function \(f(\boldsymbol{x}) = e^{-\boldsymbol{x}' \boldsymbol{x}}\), where \(\boldsymbol{x} = (x_{1}, x_{2})'\) is a column vector.
- Define the function in Python. Its argument should be a single vector \(\boldsymbol{x}\), and it should do the calculations in vector form.
- Compute the function value at \(\boldsymbol{x}_1 = (0, 1)'\), \(\boldsymbol{x}_2 = (1, 0)'\) and \(\boldsymbol{x}_3 = (1, 1)'\). Use column vectors as input!
- Compute (analytically, on paper) the gradient of the function. You may want to do it in non-vector form, writing the function as \(f(x_{1}, x_{2}) = e^{-(x_{1}^{2} + x_{2}^{2})}\). Compute the gradient, and transform it back into vector form!
- Define the gradient function in Python. It should have a single argument, the vector \(\boldsymbol{x}\), and it should return a vector, the gradient value.
- Use the function to compute the gradient value at \(\boldsymbol{x}_{1} = (0, 1)'\), \(\boldsymbol{x}_{2} = (1, 0)'\) and \(\boldsymbol{x}_{3} = (1, 1)'\).
- Can this function compute the gradient value at a longer vector, such as \(\boldsymbol{x}_{4} = (0, 1, 2, 3)'\)? Explain, and try!
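One way to check an analytic gradient is to compare it with a finite-difference approximation. The sketch below assumes the definitions of f and grad that section 20.1.2 uses. The helper numeric_grad and its step size h are illustrative choices, not part of the exercise.

```python
import numpy as np

def f(x):
    # x is a column vector of shape (n, 1); x.T @ x is a 1x1 matrix
    return np.exp(-x.T @ x)

def grad(x):
    # analytic gradient -2 x exp(-x'x), same shape as x
    return -2 * x * np.exp(-x.T @ x)

def numeric_grad(func, x, h=1e-6):
    # hypothetical helper: central differences, one coordinate at a time
    g = np.zeros_like(x, dtype=float)
    for k in range(x.shape[0]):
        step = np.zeros_like(x, dtype=float)
        step[k, 0] = h
        g[k, 0] = (func(x + step)[0, 0] - func(x - step)[0, 0]) / (2 * h)
    return g

x = np.array([[1.0], [1.0]])
print(grad(x).ravel())             # analytic gradient
print(numeric_grad(f, x).ravel())  # finite-difference approximation
```

If the two printed vectors agree to several decimal places, the analytic gradient is very likely correct. A large discrepancy usually points to a sign or factor error in the derivation.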
20.1.2 Visualizing the function and gradient
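One can visualize functions that take a 2-D vector argument. Let's do it with the function from Exercise 20.1, \(f(x_{1}, x_{2}) = e^{-(x_{1}^{2} + x_{2}^{2})}\), drawing its contour lines and its gradient as arrows: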
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    z = np.exp(-x.T @ x)
    return z

def grad(x):
    g = -2 * x * np.exp(-x.T @ x)
    return g

def showFG(f, grad = None,
           levels = (0.01, 0.033, 0.1, 0.33),
           x1range = (-2, 2), x2range=(-2, 2)):
    nGrid = 26
    ex1 = np.linspace(x1range[0], x1range[1], nGrid)
    ex2 = np.linspace(x2range[0], x2range[1], nGrid)
    (xm1, xm2) = np.meshgrid(ex1, ex2)
    x1 = xm1.ravel()
    x2 = xm2.ravel()
    z = np.zeros_like(xm1)
    g1 = np.zeros_like(xm1)
    g2 = np.zeros_like(xm1)
    ## Evaluate the function (and gradient) at every grid point
    for i in range(nGrid):
        for j in range(nGrid):
            x = np.array([[xm1[i,j]], [xm2[i,j]]])
            z[i,j] = f(x)[0,0]
            if grad is not None:
                g = grad(x)
                g1[i,j] = g[0,0]
                g2[i,j] = g[1,0]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.set_aspect('equal')
    cs = ax.contour(ex1, ex2, z, levels = levels)
    ax.clabel(cs, inline=1, fontsize=10)
    ax.quiver(ex1, ex2, g1, g2, width=0.002, headwidth=5)
    ## Print suggested levels, in case it is hard to know what is good
    print("Suggested levels:")
    print(np.percentile(z, (20, 40, 60, 80)))

showFG(f, grad)
## Suggested levels:
## [0.01147941 0.03540862 0.10921903 0.3368896 ]

[Figure: contour lines of the function with the gradient drawn as arrows, as produced by showFG(f, grad)]
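The double loop in showFG evaluates the function one grid point at a time, which works for any f that takes a column vector. For this particular function the same grid values can also be computed in one step with array arithmetic. The following is a minimal sketch of that alternative. It hard-codes the formulas for \(e^{-(x_{1}^{2} + x_{2}^{2})}\) and its gradient, so it is not a general replacement for the loop.

```python
import numpy as np

nGrid = 26
ex1 = np.linspace(-2, 2, nGrid)
ex2 = np.linspace(-2, 2, nGrid)
xm1, xm2 = np.meshgrid(ex1, ex2)

# function value exp(-(x1^2 + x2^2)) at all grid points at once
z = np.exp(-(xm1**2 + xm2**2))
# gradient components -2 * x_k * f(x), element by element
g1 = -2 * xm1 * z
g2 = -2 * xm2 * z

# same percentiles that showFG prints as suggested contour levels
print(np.percentile(z, (20, 40, 60, 80)))
```

The arrays z, g1 and g2 have the same shape as those filled in by the loop, so they could be passed to ax.contour and ax.quiver in the same way. The printed percentiles should agree with the loop-based values up to rounding.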