.. _python_vect:


=============================================================
Python vectorization
=============================================================

As observed in :ref:`rotating`, the speed of Python code can often be
increased greatly by vectorizing mathematical expressions that are applied
to NumPy arrays rather than using loops.

Here are a few more examples of how to do this.
In all these examples, we use the array `x` defined by::

    >>> import numpy as np
    >>> x = np.linspace(-2., 2., 5)
    >>> x
    array([-2., -1.,  0.,  1.,  2.])

although the point of this vectorization is to make the expressions work
well for *much* longer arrays.

Many NumPy functions apply directly to arrays, e.g.::

    >>> np.sin(x)
    array([-0.90929743, -0.84147098,  0.        ,  0.84147098,  0.90929743])

    >>> np.abs(x)
    array([ 2.,  1.,  0.,  1.,  2.])

For function definitions where different formulas are applied over different
values of `x`, the `np.where` function can generally be used.

Consider the function

:math:`f(x) = \left\{ \begin{array}{ll} x^2&~~x\leq 0.5\\ -x&~~x>0.5 \end{array}\right.`

Computing `y=f(x)` at all points in the array `x` could be done by the
loop::

    y = np.zeros(x.shape)  # initialize to shape of x

    for j in range(len(x)):
        if x[j] <= 0.5:
            y[j] = x[j]**2
        else:
            y[j] = -x[j]

This can be rewritten to be faster as::

    y = np.where(x <= 0.5, x**2, -x)

both should end up with::

    >>> y
    [ 4.  1.  0. -1. -2.]

Sometimes you may need to use `where` repeatedly, e.g. for the function
    
:math:`f(x) = \left\{ \begin{array}{ll} x^2&~~x\leq 0.5\\ -x&~~0.5<x<1.5 \\
x^4&~~ x>= 1.\end{array}\right.`

the loop::

    for j in range(len(x)):
        if x[j] <= 0.5:
            y[j] = x[j]**2
        elif x[j]<1.5:
            y[j] = -x[j]
        else:
            y[j] = x[j]**4

can be replaced by::

    y = np.where(x <= 0.5, x**2, -x)
    y = np.where(x < 1.5, y, x**4)


Note that `y` is reused in the second statement.  Either version should
give::

    >>> y
    array([  4.,   1.,   0.,  -1.,  16.])


Booleans for NumPy arrays
-------------------------

Note that conditions like `x <= 0.5` in the above example actually return
NumPy boolean arrays, e.g. ::

    >>> x <= 1.5
    array([ True,  True,  True,  True, False], dtype=bool)

You can combine these using & for `and` and | for `or`, e.g. ::

    >>> (x >= 1) & (x <= 1.5)
    array([False, False, False,  True, False], dtype=bool)

it's best to put in parentheses to make sure such conditions are parsed
properly.  In fact if you leave them out above you will get an error message
because & is the Python & that has higher precedence than <=, so ::

    >>> x >= 1 & x <= 1.5   # gives an error!

will first try to apply & to 1 and x.

It would actually be more proper to
use the numpy operators `logical_and` and `logical_or`, e.g.

    >>> np.logical_and(x>=1., x<=1.5)
    array([False, False, False,  True, False], dtype=bool)

but for boolean arrays, with proper parentheses, & and | work.  See
`<http://www.scipy.org/NumPy_for_Matlab_Users#logicalNotes>`_ for more about
this and how it relates to & and | in Matlab.

Note also that the word `and` can often be used in place of &, but not
for NumPy arrays::

    >>> (x >= 1) and (x <= 1.5)
    ValueError: The truth value of an array with more than one element is
    ambiguous. Use a.any() or a.all()

The error message refers to the NumPy functions that can be used as
follows::

    >>> np.any(x > 0.)    # is any x[j] > 0 ?
    True

    >>> np.all(x > 0.)    # is every x[j] > 0 ?
    False

or as::

    >>> b = x>0   # defines boolean array
    >>> b
    array([False, False, False,  True,  True], dtype=bool)

    >>> b.any()
    True

which can be simplified to one line::

    >>> (x>0).any()
    True

This can be useful for things like::

    >>> if (x==0.).any():
    ...     print "some x[j] is zero"
    ... else:
    ...     print "inverses of x[j]:  ", 1./x
    ... 
    some x[j] is zero