.. _python_vect: ============================================================= Python vectorization ============================================================= As observed in :ref:`rotating`, the speed of Python code can often be increased greatly by vectorizing mathematical expressions that are applied to NumPy arrays rather than using loops. Here are a few more examples of how to do this. In all these examples, we use the array `x` defined by:: >>> import numpy as np >>> x = np.linspace(-2., 2., 5) >>> x array([-2., -1., 0., 1., 2.]) although the point of this vectorization is to make the expressions work well for *much* longer arrays. Many NumPy functions apply directly to arrays, e.g.:: >>> np.sin(x) array([-0.90929743, -0.84147098, 0. , 0.84147098, 0.90929743]) >>> np.abs(x) array([ 2., 1., 0., 1., 2.]) For function definitions where different formulas are applied over different values of `x`, the `np.where` function can generally be used. Consider the function :math:`f(x) = \left\{ \begin{array}{ll} x^2&~~x\leq 0.5\\ -x&~~x>0.5 \end{array}\right.` Computing `y=f(x)` at all points in the array `x` could be done by the loop:: y = np.zeros(x.shape) # initialize to shape of x for j in range(len(x)): if x[j] <= 0.5: y[j] = x[j]**2 else: y[j] = -x[j] This can be rewritten to be faster as:: y = np.where(x <= 0.5, x**2, -x) both should end up with:: >>> y [ 4. 1. 0. -1. -2.] Sometimes you may need to use `where` repeatedly, e.g. for the function :math:`f(x) = \left\{ \begin{array}{ll} x^2&~~x\leq 0.5\\ -x&~~0.5= 1.\end{array}\right.` the loop:: for j in range(len(x)): if x[j] <= 0.5: y[j] = x[j]**2 elif x[j]<1.5: y[j] = -x[j] else: y[j] = x[j]**4 can be replaced by:: y = np.where(x <= 0.5, x**2, -x) y = np.where(x < 1.5, y, x**4) Note that `y` is reused in the second statement. Either version should give:: >>> y array([ 4., 1., 0., -1., 16.]) Booleans for NumPy arrays ------------------------- Note that conditions like `x <= 0.5` in the above example actually return NumPy boolean arrays, e.g. :: >>> x <= 1.5 array([ True, True, True, True, False], dtype=bool) You can combine these using & for `and` and | for `or`, e.g. :: >>> (x >= 1) & (x <= 1.5) array([False, False, False, True, False], dtype=bool) it's best to put in parentheses to make sure such conditions are parsed properly. In fact if you leave them out above you will get an error message because & is the Python & that has higher precedence than <=, so :: >>> x >= 1 & x <= 1.5 # gives an error! will first try to apply & to 1 and x. It would actually be more proper to use the numpy operators `logical_and` and `logical_or`, e.g. >>> np.logical_and(x>=1., x<=1.5) array([False, False, False, True, False], dtype=bool) but for boolean arrays, with proper parentheses, & and | work. See ``_ for more about this and how it relates to & and | in Matlab. Note also that the word `and` can often be used in place of &, but not for NumPy arrays:: >>> (x >= 1) and (x <= 1.5) ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() The error message refers to the NumPy functions that can be used as follows:: >>> np.any(x > 0.) # is any x[j] > 0 ? True >>> np.all(x > 0.) # is every x[j] > 0 ? False or as:: >>> b = x>0 # defines boolean array >>> b array([False, False, False, True, True], dtype=bool) >>> b.any() True which can be simplified to one line:: >>> (x>0).any() True This can be useful for things like:: >>> if (x==0.).any(): ... print "some x[j] is zero" ... else: ... print "inverses of x[j]: ", 1./x ... some x[j] is zero