Here we present some notes about the multigrid method.
The mmf.math.multigrid module provides efficient ways of solving Poisson-type equations:

$$\nabla^2 u(x) = \rho(x).$$
The geometric version of the problem discretizes the left-hand side on a sequence of successively finer grids.
Here we implement some simple multigrid solvers. These solve a series of related linear problems
in which the matrices have special properties, typically representing derivative operators such as the Laplacian on a series of grids of increasing size.
Formally, the problem requires a set of three operators:

- A restriction operator $R$ that takes a right-hand side of the problem on a finer grid to a right-hand side on a coarser grid.
- A smoothing operator $S$ that removes high-frequency components of the error from the solution.
- An interpolation operator $I$ that takes a solution on a coarser grid to a solution on the finer grid.
Typically these are linear operators (and we shall consider only linear operators here) and should satisfy the Galerkin condition

$$T_{\text{coarse}} = R\,T_{\text{fine}}\,I,$$

which serves to define the coarse grid operators in terms of the finer grid operator. A second condition is usually imposed, that the interpolation and restriction operations be related by transposition:

$$I \propto R^{T}.$$

This helps with the analysis by defining a variational property: the solution to the coarse problem, when interpolated, minimizes the fine-grid error over the range of the interpolation.
Note
I think this states that the solution to the coarse problem provides the minimum-error solution to the fine problem when interpolated. In other words, if we only consider solutions of the form $u = I u_c$, then the best solution (in the sense of minimizing the error $\|u - u_{\text{exact}}\|$) in this subspace will be the interpolation of the solution to the coarsened problem. However, I can't prove this right now.
As we shall see, the Galerkin condition is not always easily maintained if efficient matrix operations are desired.
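To make the Galerkin condition concrete, here is a small self-contained numpy sketch (ours, not part of mmf.math.multigrid) using the 1d Dirichlet Laplacian with linear interpolation and full-weighting restriction $R = I^T/2$; in this classic case the Galerkin coarse operator reproduces the coarse discretization exactly:

import numpy as np

def laplacian(n, h):
    # Three-point second difference on n interior points (Dirichlet).
    return (np.diag(-2.0*np.ones(n))
            + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1))/h**2

n_c = 3                      # coarse interior points
n_f = 2*n_c + 1              # fine interior points
h = 1.0/(n_f + 1)            # fine spacing on [0, 1]

I = np.zeros((n_f, n_c))     # linear interpolation (coarse -> fine)
for j in range(n_c):
    I[2*j, j] = 0.5
    I[2*j + 1, j] = 1.0
    I[2*j + 2, j] = 0.5
R = I.T/2.0                  # full-weighting restriction, R = I^T/2

# Galerkin condition: R T_fine I reproduces the coarse operator.
assert np.allclose(np.dot(R, np.dot(laplacian(n_f, h), I)),
                   laplacian(n_c, 2*h))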
The smoothing operator must provide a relaxation method for the problem. The efficiency of the multigrid method relies on each of these operations being applicable in $\mathcal{O}(n)$ time, where $n$ is the number of unknowns. They are typically just weighted averages of nearest neighbours, which respects this requirement.
The idea is for one to use $S$ to damp the high-frequency errors, then to apply the algorithm recursively, using $R$ and $I$ to go to a coarser grid. On the coarser grid, the low-frequency errors become high-frequency errors which are again damped. Typically, this is accomplished with a weighted Jacobi algorithm or red-black ordered Gauss-Seidel. The idea is to form an update

$$u \mapsto u + wD(f - Tu)$$

that is idempotent on the solution $Tu = f$. This ensures that the errors evolve as $e \mapsto (\mathbf{1} - wDT)e$. The eigenvalues of $\mathbf{1} - wDT$ thus control the convergence: one chooses the iteration in such a way that the eigenvalues of the high-frequency components are substantially smaller than one in magnitude. The choice of $w$ and $D$ is discussed in detail in the section Smoothing below.
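Schematically, one V-cycle of this recursion might look like the following sketch (hypothetical smooth, restrict, and interpolate callables; an illustration, not the module's interface):

import numpy as np

def v_cycle(Ts, f, u, level, smooth, restrict, interpolate, n_smooth=2):
    # One V-cycle for Ts[level] u = f, where Ts is a list of operators,
    # one per grid, with Ts[0] the coarsest.
    for _ in range(n_smooth):
        u = smooth(Ts[level], f, u)        # damp high-frequency error
    if level > 0:
        r = f - Ts[level].dot(u)           # fine-grid residual
        rc = restrict(r, level)            # take it to the coarser grid
        ec = v_cycle(Ts, rc, np.zeros_like(rc), level - 1,
                     smooth, restrict, interpolate, n_smooth)
        u = u + interpolate(ec, level)     # coarse-grid correction
        for _ in range(n_smooth):
            u = smooth(Ts[level], f, u)    # post-smoothing
    return u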
The first step, however, is to consider the discretization of the Poisson operator $T \approx \nabla^2$. This should have several properties:
As the Laplacian is a symmetric operator (in the sense that all the eigenvalues are real), the discretization should preserve this property. This requires a fairly careful implementation of the boundary conditions.
The Laplacian should preserve charge neutrality with applicable boundary conditions; this is the finite version of integration by parts:

$$\sum_i (Tu)_i = \text{(boundary terms)}.$$

Thus, with Neumann boundary conditions or without a boundary (for example, if one has periodic boundary conditions), for any vector $u$ one should have $\sum_i (Tu)_i = 0$, which means $\mathbf{1}^T T = 0$. (For Dirichlet boundary conditions, one must include the boundary contribution, and so this relationship will not hold at the boundaries.)

In addition, the equation $Tu = f$ will have a solution iff $\mathbf{1}^T f = 0$. This means that $T$ must have a singular direction. This must be considered when solving such systems.
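As a quick standalone numpy check (not using the module), the periodic second-difference matrix exhibits all of these properties:

import numpy as np

n = 8
T = -2*np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
T[0, -1] = T[-1, 0] = 1                        # periodic wrap-around
assert np.allclose(T, T.T)                     # symmetry
assert np.allclose(T.sum(axis=0), 0)           # neutrality: 1^T T = 0
assert np.allclose(np.dot(T, np.ones(n)), 0)   # singular (constant) direction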
The second question is the order of the discretization. Apparently, for the multigrid method, higher order is not very helpful, so we content ourselves with the symmetric-difference scheme.
We consider the following operators for the second derivatives which satisfy these properties. Each of these can be expressed in terms of the forward and backward difference operators:

$$d^2(\nabla^2 u)_n = u_{n+1} - 2u_n + u_{n-1}, \qquad \nabla^2 = D_+D_- = D_-D_+.$$

We must, however, determine the consistent abscissa. This is done by considering the plane-wave wavefunctions $u(x) = e^{\mathrm{i}kx}$. For intermediate points with spacing $d$ we immediately find the eigenvalues

$$d^2\lambda_k = 2\bigl(\cos(kd) - 1\bigr) = -4\sin^2(kd/2).$$
We now consider the endpoints. Let the left endpoint be at $x_0 = fd$. We then have a condition at the left endpoint, obtained with the eigenfunctions $\sin(kx)$ for Dirichlet and $\cos(kx)$ for Neumann boundary conditions respectively (the periodic condition places no restrictions): the boundary condition must eliminate the ghost point at $x_0 - d$ from the stencil. A consistent discretization will have a solution for all $k$, which leads to the following consistency conditions.

For the Dirichlet case, here we have $\sin\bigl(k(x_0 - d)\bigr) = 0$, which is always satisfied if $x_0 = d$:
o    x    x    x  ...  x    o
|----|----|----|-- ... --|----|
0    1d   2d   3d  ...  Nd   L
Hence, given the interval $[0, L]$ with the Dirichlet conditions $u(0) = u(L) = 0$, the discretization is valid for the lattice points $x_n = nd$ for $n = 1, \dots, N$ with $d = L/(N+1)$.
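Here is a little check of the Dirichlet case (our addition, in the same spirit as the Neumann check below), using the eigenvectors $\sin(k_m x_n)$ with $x_n = nd$:

>>> import numpy as np
>>> L, N, f = 1.0, 4, 1.0
>>> d = L/(N - 1 + 2*f)
>>> x = d*np.arange(1, N + 1)
>>> k = np.pi*np.arange(1, N + 1)/L
>>> V = np.sin(k[None, :]*x[:, None])
>>> V = 1/np.sqrt(np.diag(np.dot(V.T, V)))*V
>>> E = 2*(np.cos(k*d) - 1)
>>> T = np.dot(np.dot(V, np.diag(E)), V.T)
>>> T_expected = (np.diag(-2.0*np.ones(N)) + np.diag(np.ones(N - 1), 1)
...               + np.diag(np.ones(N - 1), -1))
>>> np.allclose(T, T_expected)
True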
For the Neumann case, here we have $\cos\bigl(k(x_0 - d)\bigr) = \cos(kx_0)$, which is always satisfied if $x_0 = d/2$, so that $d = L/N$:
o x x x ... x x o
|--|----|----|-- ... |----|--|
0 d/2 3d/2 5d/2 ... L
Here is a little check of the Neumann case. We first form the eigenvectors and eigenvalues, and then show that we recover the desired form of operator:
>>> import numpy as np
>>> from numpy import pi, cos, diag
>>> L = 1
>>> N = 4
>>> f = 0.5
>>> d = L/(N - 1 + 2*f)
>>> x0 = f*d
>>> x = np.ogrid[x0:L-x0:N*1j]
>>> n = np.arange(N)
>>> k = pi*n/L
>>> V = np.cos(k[None,:]*x[:,None])
>>> V = 1/np.sqrt(np.diag(np.dot(V.T, V)))*V
>>> E = 2*(cos(k*d) - 1).ravel()
>>> T = np.dot(np.dot(V, diag(E)), V.T).round(2)
>>> T
array([[-1.,  1., -0.,  0.],
       [ 1., -2.,  1., -0.],
       [-0.,  1., -2.,  1.],
       [ 0., -0.,  1., -1.]])
One difficulty with the Neumann boundary conditions is that they don't lend themselves to trivial restriction and interpolation: there is no way of removing lattice sites while preserving the ratio $x_0/d = 1/2$. One has several options:
- Choose a different parametrization of $T$ that lends itself to simple restriction and interpolation. One way to do this is to include the endpoint as part of the grid (i.e. $x_0 = 0$). The discretization at the endpoint is then

  $$d^2(\nabla^2 u)_0 = 2(u_1 - u_0).$$

  Unfortunately this fails both the symmetry and neutrality criteria.
- Keep the current form of the discretization but choose a modified restriction and interpolation to preserve the grid properties. The complication here is that the lattice points of the coarser grid will not coincide with those of the finer grids.
- Keep the current form of the discretization and the trivial restriction and interpolation, and hope that the method will work. This may still work if we impose the Galerkin principle.

Let's try this in one dimension on a test problem that may be considered either with periodic boundary conditions or with Neumann boundary conditions on $[0, L]$. We use the following refinement, with $n_m = 2^{m-1}n_0 + 1$ grid points at level $m \ge 1$:
|      x             x      |   n0 = 2
|    x        x        x    |   1 n0 + 1 = 3
|  x    x     x     x    x  |   2 n0 + 1 = 5
| x  x  x  x  x  x  x  x  x |   4 n0 + 1 = 9
0---------------------------L
Note that the length of the grid varies from level to level: our hope is that enforcing the Galerkin principle will suffice to ensure good convergence behaviour, and we shall take this setup as our example problem. The naïve interpolation and restriction operators preserve the form of $T$ and so will work with the first form of the problem. These are probably not very good in that they do not preserve constants at the boundaries, but maybe the Galerkin principle will come to the rescue, as the following sketch illustrates:
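For instance, with hypothetical naive operators (linear interpolation from 3 to 5 points spanning the interval, and restriction as the scaled transpose), constants survive interpolation but not restriction in the boundary rows:

import numpy as np

I = np.zeros((5, 3))             # naive linear interpolation, 3 -> 5 points
for j in range(3):
    I[2*j, j] = 1.0
for j in range(2):
    I[2*j + 1, j] = I[2*j + 1, j + 1] = 0.5
R = I.T/2.0                      # naive restriction, R = I^T/2

print(np.dot(I, np.ones(3)))     # [ 1.  1.  1.  1.  1.]: constants preserved
print(np.dot(R, np.ones(5)))     # [ 0.75  1.    0.75]: fails at the boundaries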
Another complication arises with a non-trivial metric, which we permit in order to allow distorted lattices. We allow the user to provide a matrix $A$ whose columns are the lattice vectors. This gives rise to $\delta_x$, the matrix of displacements (roughly $\delta_x \sim A/N$, where $N$ is the number of segments; one must take care to include the boundary conditions appropriately).
Consider a 2d grid with basis vectors $\vec{a}_1$ and $\vec{a}_2$:
Let $D_{ij}$ be the difference operator about the point of interest so that $D_{ij}u \approx \partial_i\partial_j u$. We have the following relationships, valid to order $d^2$ (where we have used the fact that the mixed partial derivatives commute):
We can compute the Laplacian as follows:
Since the Laplacian is symmetric, $D_{ij} = D_{ji}$. Furthermore, the multidimensional matrices may be constructed from the lower dimensional versions using the tensor product. Thus, we really only need a one-dimensional stencil for the diagonal terms $D_{ii}$ and a two-dimensional stencil for the mixed terms $D_{ij}$.
To be explicit, we compute all the two-dimensional stencils. Let us reshape the operator over the coordinates. It is customary to consider the following 9-point stencil representing the averaging of the surrounding cells about the central point:

Denote the indices of the stencil entries accordingly; then the symmetry and neutrality constraints have the form:

For example, in the interior, we have a single stencil for all points, thus the conditions reduce to symmetry of the stencil under inversion and the vanishing of the sum of its entries. The constructions are linear, so we construct a "basis" of stencils representing the various combinations that satisfy these properties. Then the linear combinations required to compute $\nabla^2$ will by construction also satisfy the properties. In the interior we have the following stencils:
We note that, to order $d^2$, there is also a "zero" stencil:
For the Neumann boundary conditions we have the following set of stencils (plus rotations) at the edges:
and at the corner we have:
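As a simple instance of the constraints (using the familiar interior 5-point Laplacian stencil as an example; the basis stencils above may differ), neutrality means the entries sum to zero, and symmetry means invariance under inversion through the centre:

import numpy as np

S = np.array([[0.,  1., 0.],
              [1., -4., 1.],
              [0.,  1., 0.]])          # standard interior 5-point stencil
assert np.isclose(S.sum(), 0)          # neutrality
assert np.allclose(S, S[::-1, ::-1])   # inversion symmetry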
Now we consider the boundary conditions. The most difficult are the lattice Neumann conditions, where at the point $x_0 = d/2$ we have (see the discussion above about the 1d discretization) that the ghost point satisfies $u_{-1} = u_0$. This gives us an additional relationship, and so we have:
From this we can form the coordinate transformation:
The Laplacian is then
where the matrices are broadcast appropriately such that the summation is over the coordinate axes. This will preserve the symmetry and neutrality properties.
Note
The Neumann boundary conditions will not be strictly preserved by this: the derivative will not vanish along the normal to the boundary (which is the usual condition) but will instead vanish along the lattice directions. This is actually what we want, because we are using the Neumann cell to model an octant of the full unit cell in a periodic system.
We first consider the form of the kinetic operator $T$. Generally $T$ has eigenvalues

$$\lambda_{\vec{k}} = \sum_i \frac{2\bigl(\cos(k_i d_i) - 1\bigr)}{d_i^2} \in \Bigl[-4\sum_i d_i^{-2},\ 0\Bigr].$$
Note
The arbitrary metric and dimensionality are incorporated into the factor $\operatorname{tr}(\delta_x^{-T}\delta_x^{-1}) = \sum_i d_i^{-2}$. To see this, consider that the trace sums over the dimensions of the Laplacian, with each dimension contributing $1/d_i^2$.
import numpy as np
import matplotlib.pyplot as plt
import mmf.math.multigrid

# Periodic 1d problem.
mg = mmf.math.multigrid.MG(
    d=1, n_0=1, n=5, L=0.1, boundary_conditions='periodic')
T = -mg.get_mats()[0][-1].toarray()
#T = -mg.to_mat(mg.x()[0], mg.T)
assert np.allclose(T, T.T)
E = np.linalg.eigvalsh(T)
dx_inv2 = np.trace(np.dot(mg.dx_inv().T, mg.dx_inv()))
plt.plot(np.arange(len(E), dtype=float)/len(E),
         E/(4*dx_inv2),
         'x',
         label="d=%i" % (mg.d,))

# Repeat in three dimensions.
mg.initialize(n=3, d=3, n_0=(1, 1, 1))
T = -mg.get_mats()[0][-1].toarray()
#T = -mg.to_mat(mg.x()[0], mg.T)
assert np.allclose(T, T.T)
E = np.linalg.eigvalsh(T)
dx_inv2 = np.trace(np.dot(mg.dx_inv().T, mg.dx_inv()))
plt.plot(np.arange(len(E), dtype=float)/len(E),
         E/(4*dx_inv2),
         '.',
         label="d=%i" % (mg.d,))

plt.legend(loc='upper left')
plt.title("Eigenvalues of T")
plt.ylabel(r"Eigenvalue/($-4/\delta_x^2$)")
In order to ensure idempotency, we want a smoothing operation of the form

$$u \mapsto u + wD(f - Tu),$$

where $D \approx T^{-1}$. This trivially satisfies the idempotency requirement. The goal is to choose the weight $w$ and the matrix $D$ (pre-conditioner), easy to compute in the sense that applying it has $\mathcal{O}(n)$ cost, such that the eigenvalues of $\mathbf{1} - wDT$ corresponding to high-frequency components have magnitude less than one. In the case of only a constant mass term, we may neglect the mass and take the scalar $D = -d^2/2$ (in 1d) to obtain the eigenvalues

$$1 - w\bigl(1 - \cos(kd)\bigr).$$
Since our "multigrid" involves coarsening the grid by a factor of two, we wish to minimize the magnitude over the upper half of the frequencies, defining a fitness

$$F(w) = \max_{kd\,\in\,[\pi/2,\,\pi]} \bigl|1 - w(1 - \cos(kd))\bigr|.$$

The extrema occur at the endpoints, and a little inspection tells us that the optimal weight must satisfy

$$1 - w\bigl(1 - \cos(kd)\bigr)\Bigr|_{kd=\pi/2} = -\Bigl(1 - w\bigl(1 - \cos(kd)\bigr)\Bigr)\Bigr|_{kd=\pi}.$$

For reasonable problems, in the limit of large lattices, the kinetic terms dominate, and this approaches $w = 2/3$.
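A quick numerical scan confirms this optimum (our own check, not part of the module):

import numpy as np

kd = np.linspace(np.pi/2, np.pi, 1001)   # upper half of the frequencies
w = np.linspace(0.1, 1.0, 9001)
fitness = np.abs(1 - w[:, None]*(1 - np.cos(kd)[None, :])).max(axis=1)
print(w[fitness.argmin()])               # ~0.6667, i.e. w = 2/3
print(fitness.min())                     # ~0.3333: worst-case damping of 1/3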
This suggests that as a general pre-conditioner we might use the scalar

$$D = -\Bigl(2\operatorname{tr}(\delta_x^{-T}\delta_x^{-1})\Bigr)^{-1} = -\Bigl(2\sum_i d_i^{-2}\Bigr)^{-1},$$

with a weight of $w = 2/3$. The more standard pre-conditioner would be

$$D = \operatorname{diag}(T)^{-1},$$

with the same weight, as this corresponds directly to the diagonal of $T$, but it is not clear to me yet why this is better.
Finally, recall that we also consider the case where $T$ has an additional matrix structure so that it is block diagonal in position space (rather than strictly diagonal). Here one has the choice of extracting the diagonal, or spending the extra work to diagonalize each of the blocks.
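A sketch of the two choices (hypothetical shapes, not the module's API; for brevity we invert each block rather than diagonalizing it): with an $n_b \times n_b$ block at each of $n$ lattice points, both options remain $\mathcal{O}(n)$ for fixed block size:

import numpy as np

n, n_b = 5, 3
np.random.seed(0)
blocks = np.random.rand(n, n_b, n_b)
blocks = blocks + blocks.transpose(0, 2, 1) + 4*np.eye(n_b)  # symmetric blocks

D_scalar = 1.0/np.einsum('ijj->ij', blocks)  # cheap: reciprocal of the diagonal
D_block = np.linalg.inv(blocks)              # more work: invert each block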