Chapter 20 Machine learning techniques
20.1 Gradient descent
20.1.1 Vector functions
Exercise 20.1
Consider the function \(f(\boldsymbol{x}) = e^{-\boldsymbol{x}' \cdot \boldsymbol{x}}\), where \(\boldsymbol{x} = (x_{1}, x_{2})'\) is a column vector.
- Define the function in Python. Its argument should be a single vector \(\boldsymbol{x}\), and it should do the calculations in vector form.
- Compute the function value at \(\boldsymbol{x}_1 = (0, 1)'\), \(\boldsymbol{x}_2 = (1, 0)'\) and \(\boldsymbol{x}_3 = (1, 1)'\). Use column vectors as input!
- Compute (analytically, on paper) the gradient of the function. You may want to do it in non-vector form, writing the function as \(f(x_{1}, x_{2}) = e^{-(x_{1}^{2} + x_{2}^{2})}\). Compute the gradient, and transform it back into vector form!
- Define the gradient function in Python. It should have a single argument, the vector \(\boldsymbol{x}\), and it should return a vector, the gradient value.
- Use the function to compute the gradient value at \(\boldsymbol{x}_{1} = (0, 1)'\), \(\boldsymbol{x}_{2} = (1, 0)'\) and \(\boldsymbol{x}_{3} = (1, 1)'\).
- Can this function compute the gradient value at a longer vector, such as \(\boldsymbol{x} = (0, 1, 2, 3)'\)? Explain, and try!
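A hand-derived gradient is easy to get wrong, so it can be worth comparing it against a numerical approximation. The sketch below is one possible way to do that, not part of the exercise itself: it uses the same `f` and `grad` definitions that appear in the next section, and adds a hypothetical helper `numeric_grad` (an assumption, not something the text defines) that approximates each partial derivative with central finite differences. All inputs are column vectors of shape (n, 1).

```python
import numpy as np

def f(x):
    # x is a column vector of shape (n, 1); x.T @ x is a 1x1 matrix
    return np.exp(-x.T @ x)

def grad(x):
    # analytic gradient -2 x exp(-x'x), same shape as x
    return -2 * x * np.exp(-x.T @ x)

def numeric_grad(func, x, h=1e-6):
    # hypothetical helper: central finite differences,
    # perturbing one coordinate of x at a time
    g = np.zeros_like(x, dtype=float)
    for k in range(x.shape[0]):
        e = np.zeros_like(x, dtype=float)
        e[k, 0] = h
        g[k, 0] = (func(x + e)[0, 0] - func(x - e)[0, 0]) / (2 * h)
    return g

points = [
    np.array([[0.0], [1.0]]),
    np.array([[1.0], [0.0]]),
    np.array([[1.0], [1.0]]),
    np.array([[0.0], [1.0], [2.0], [3.0]]),
]
for x in points:
    # print the point, function value, analytic and numeric gradient
    print(x.ravel(), f(x)[0, 0], grad(x).ravel(), numeric_grad(f, x).ravel())
```

If the analytic and numeric columns agree to several digits, the derivation is probably right. Note that nothing in `f` or `grad` refers to the length of `x`, which is a useful hint for the last question of the exercise.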
20.1.2 Visualizing the function and gradient
One can visualize functions that take a 2-D vector argument. Let’s do it with the function from exercise 20.1:
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    ## x is a column vector, so x.T @ x is a 1x1 matrix
    z = np.exp(-x.T @ x)
    return z

def grad(x):
    ## gradient of f: a column vector of the same shape as x
    g = -2 * x * np.exp(-x.T @ x)
    return g

def showFG(f, grad = None,
           levels = (0.01, 0.033, 0.1, 0.33),
           x1range = (-2, 2), x2range=(-2, 2)):
    nGrid = 26
    ex1 = np.linspace(x1range[0], x1range[1], nGrid)
    ex2 = np.linspace(x2range[0], x2range[1], nGrid)
    (xm1, xm2) = np.meshgrid(ex1, ex2)
    z = np.zeros_like(xm1)
    g1 = np.zeros_like(xm1)
    g2 = np.zeros_like(xm1)
    ## Evaluate the function (and the gradient) at every grid point
    for i in range(nGrid):
        for j in range(nGrid):
            x = np.array([[xm1[i,j]], [xm2[i,j]]])
            z[i,j] = f(x)[0,0]
            if grad is not None:
                g = grad(x)
                g1[i,j] = g[0,0]
                g2[i,j] = g[1,0]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.set_aspect('equal')
    ## Contour lines of the function
    cs = ax.contour(ex1, ex2, z, levels = levels)
    ax.clabel(cs, inline=1, fontsize=10)
    ## Gradient vectors as arrows, only if a gradient function was given
    if grad is not None:
        ax.quiver(ex1, ex2, g1, g2, width=0.002, headwidth=5)
    ## Print suggested levels, in case it is hard to know what is good
    print("Suggested levels:")
    print(np.percentile(z, (20, 40, 60, 80)))

showFG(f, grad)
plt.show()
## Suggested levels:
## [0.01147941 0.03540862 0.10921903 0.3368896 ]

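The double loop in `showFG` works for any function that accepts a column vector, which is why it evaluates the grid one point at a time. For this particular \(f\), the same grid values can also be computed in one step with array arithmetic. The following is a minimal sketch under that assumption: it hard-codes the non-vector form \(e^{-(x_{1}^{2} + x_{2}^{2})}\) and its gradient, so unlike `showFG` it does not generalize to other functions.

```python
import numpy as np

# same grid as in showFG: 26 points in each direction on [-2, 2]
ex1 = np.linspace(-2, 2, 26)
ex2 = np.linspace(-2, 2, 26)
xm1, xm2 = np.meshgrid(ex1, ex2)

# function values exp(-(x1^2 + x2^2)) at all grid points at once
z = np.exp(-(xm1**2 + xm2**2))

# gradient components: -2 * x_k * f(x) for k = 1, 2
g1 = -2 * xm1 * z
g2 = -2 * xm2 * z

# quantiles of z, the same quantity showFG prints as suggested levels
print(np.percentile(z, (20, 40, 60, 80)))
```

The arrays `z`, `g1` and `g2` have the same layout as those filled in by the loop, so they could be passed to `ax.contour` and `ax.quiver` in the same way. Every gradient vector points toward the origin, where the function attains its maximum.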