{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"colab": {
"name": "Week1_3_python.ipynb",
"provenance": []
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "O_Et2AH3R1Lb"
},
"source": [
"# Week 1 Coding Lecture 3: Matrices\n",
"If you are familiar with linear algebra or physics, then you already know the importance of matrices. If you have not encountered matrices before, you can think of them as several row vectors stacked on top of each other or as several column vectors side by side. For instance, the following matrix can either be seen as four row vectors, each with three entries, or as three column vectors, each with four entries. We call it a $4\\times 3$ matrix. \n",
"\n",
"$$A = \\begin{pmatrix} -1 & 2 & 1 \\\\ 3 & 0 & -1 \\\\ 4 & -2 & 2 \\\\ -2 & 1 & 3 \\end{pmatrix}.$$\n",
"\n",
"In general, an $m\\times n$ matrix has $m$ rows and $n$ columns. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HnV6mjscR1Ld"
},
"source": [
"## Constructing matrices\n",
"In python, matrices are represented as 2D arrays. We have already seen some simple examples of matrices - row and column vectors are actually special cases of matrices. A row vector with $n$ entries is a $1\\times n$ matrix and a column vector with $n$ entries is an $n\\times 1$ matrix. \n",
"\n",
"We know how to make row and column vectors in python by first making a 1D array and then using the reshape command. In principle, you could use this method to make all matrices, but it quickly becomes tedious. Instead, there is a more direct way to construct a 2D array. As an example, let's construct the matrix $A$. "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "aLV_zwwBR1Lg",
"outputId": "01e9f6f5-2d99-4324-cc4f-cd0bc3f273d2"
},
"source": [
"import numpy as np\n",
"\n",
"A = np.array([[-1, 2, 1], [3, 0, -1], [4, -2, 2], [-2, 1, 3]])\n",
"print(A)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[-1 2 1]\n",
" [ 3 0 -1]\n",
" [ 4 -2 2]\n",
" [-2 1 3]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6S4F5fVrR1Lh"
},
"source": [
"The syntax is very similar to that for 1D arrays, but with some extra brackets. One set of brackets encloses the entire matrix, and then each row is enclosed in another set of brackets. You then use commas to separate entries within a row and to separate different rows. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nkGZfRRMR1Lh"
},
"source": [
"As practice, let's also make a $3\\times 4$ matrix $B$ and two $3\\times 3$ matrices $C$ and $D$. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "nMbWRDQJR1Li",
"outputId": "a1d46d95-1829-4aec-bf4e-cc95c88bfd5d"
},
"source": [
"B = np.array([[1, -2, 0, 3], [2, 1, -4, 1], [-3, 0, 1, 1]])\n",
"print(B)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1 -2 0 3]\n",
" [ 2 1 -4 1]\n",
" [-3 0 1 1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "vzFbrgcNR1Li",
"outputId": "6d9a5e4a-d562-458d-f450-5f46c856d14e"
},
"source": [
"C = np.array([[1, 0, 2], [3, -1, 1], [2, 2, 0]])\n",
"print(C)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1 0 2]\n",
" [ 3 -1 1]\n",
" [ 2 2 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "qMQmx8DpR1Li",
"outputId": "74c1eab3-1bef-4380-cefb-e4d59e65878f"
},
"source": [
"D = np.array([[1, 2, -1], [3, -2, 2], [-1, 1, 0]])\n",
"print(D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1 2 -1]\n",
" [ 3 -2 2]\n",
" [-1 1 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FlT8y1DnR1Lj"
},
"source": [
"Notice that these arrays still have the same type as the vectors and arrays we discussed last time, but they have a different number of dimensions and shape. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "-rjo8cv9R1Lj",
"outputId": "ba149c3c-ee41-412d-c812-83b286491242"
},
"source": [
"print(\"The matrix A has the following type:\")\n",
"print(type(A))\n",
"print(\"And the following number of dimensions:\")\n",
"print(A.ndim)\n",
"print(\"And the following shape:\")\n",
"print(A.shape)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"The matrix A has the following type:\n",
"<class 'numpy.ndarray'>\n",
"And the following number of dimensions:\n",
"2\n",
"And the following shape:\n",
"(4, 3)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "P4UijkOgR1Lk"
},
"source": [
"It's also worth noting that this gives us a much easier way to construct row and column vectors. For example, if we wanted to make the vectors \n",
"$$\\mathbf{x} = \\begin{pmatrix} 1 \\\\ 2 \\\\ 3 \\end{pmatrix}$$\n",
"and \n",
"$$\\mathbf{y} = \\begin{pmatrix} 4 & 5 & 6 \\end{pmatrix},$$\n",
"we could use the code"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Adn_JqKhR1Lk",
"outputId": "6c51b413-2e87-4800-e420-25e81639f11f"
},
"source": [
"x = np.array([[1], [2], [3]])\n",
"print(x)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[1]\n",
" [2]\n",
" [3]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "SDti-ijiR1Lk",
"outputId": "b4500c3e-c232-4e81-e975-463a3739d2d2"
},
"source": [
"y = np.array([[4, 5, 6]])\n",
"print(y)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[4 5 6]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aU37xmDMR1Lk"
},
"source": [
"This is often much easier than using the reshape command. "
]
},
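{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check, the sketch below builds the same column vector both ways and confirms that the two results agree. The names `x_direct` and `x_reshaped` are only illustrative and are not used anywhere else in this lecture.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Direct construction: one inner list per row\n",
"x_direct = np.array([[1], [2], [3]])\n",
"\n",
"# Reshape approach: build a 1D array, then turn it into a 3x1 column\n",
"x_reshaped = np.reshape(np.array([1, 2, 3]), (3, 1))\n",
"\n",
"print(x_direct.shape)\n",
"print(np.array_equal(x_direct, x_reshaped))\n",
"```\n",
"\n",
"Both arrays have shape `(3, 1)`, so either construction can be used wherever a column vector is needed."
]
},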
{
"cell_type": "markdown",
"metadata": {
"id": "xXvmLee_R1Ll"
},
"source": [
"## Matrix arithmetic\n",
"Since matrices have the same type (ndarray) as the arrays we looked at last time, it should not surprise you that the basic arithmetic operations work the same with matrices as they do with vectors. In particular, you can add or subtract the same number from each entry of a matrix and multiply or divide every entry in a matrix by the same number, or even raise every entry of a matrix to the same power. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "AER5HlDFR1Ll",
"outputId": "74375901-3b72-4114-9110-f98ae6f85a8b"
},
"source": [
"print(A + 5)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[4 7 6]\n",
" [8 5 4]\n",
" [9 3 7]\n",
" [3 6 8]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "aTCkmSX7R1Ll",
"outputId": "db146cac-3293-44c2-e019-5b15267cf824"
},
"source": [
"print(B - 2)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[-1 -4 -2 1]\n",
" [ 0 -1 -6 -1]\n",
" [-5 -2 -1 -1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "I4xs2rrMR1Lm",
"outputId": "74f5a642-6a6e-4b9f-8d2b-a585a4fa1e56"
},
"source": [
"print(C * 3)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 3 0 6]\n",
" [ 9 -3 3]\n",
" [ 6 6 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "N5gSvegDR1Lm",
"outputId": "efea6cb7-a2f5-4387-e579-0e9f35500fa0"
},
"source": [
"print(D / 4)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 0.25 0.5 -0.25]\n",
" [ 0.75 -0.5 0.5 ]\n",
" [-0.25 0.25 0. ]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "SyelJfV6R1Lm",
"outputId": "a7e55bf5-62d8-4a53-8533-6a7a116894fa"
},
"source": [
"print(A ** 2)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1 4 1]\n",
" [ 9 0 1]\n",
" [16 4 4]\n",
" [ 4 1 9]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6Yq6iLW0R1Ln"
},
"source": [
"You can also combine matrices with elementwise operations. For instance, we could add all the corresponding elements of $C$ and $D$. (That is, add the top left entry of $C$ to the top left entry of $D$, then add the top middle entry of $C$ to the top middle entry of $D$, etc.)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "pd_DacBIR1Ln",
"outputId": "dd4a514c-d8d5-4bc0-b014-efdc6e35417c"
},
"source": [
"print(C + D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 2 2 1]\n",
" [ 6 -3 3]\n",
" [ 1 3 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-crJw97AR1Ln"
},
"source": [
"The same thing works for subtraction, multiplication, division and exponentiation. For example, "
]
},
{
"cell_type": "code",
"metadata": {
"id": "l0JXgpbcR1Ln",
"outputId": "81d33215-35a2-4a64-b40a-61bb3174d433"
},
"source": [
"print(C - D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 0 -2 3]\n",
" [ 0 1 -1]\n",
" [ 3 1 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "0RoJR3TXR1Lo",
"outputId": "48d16ff7-afb6-4a6b-f80a-6fe92b542098"
},
"source": [
"print(C * D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1 0 -2]\n",
" [ 9 2 2]\n",
" [-2 2 0]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5pkwBYTXR1Lo"
},
"source": [
"**Side note about types:** I did not show `C / D` because one of the entries of `D` is zero, so one of the divisions would be a division by zero. You should try it on your own and see what happens: instead of stopping with an error, NumPy prints a warning and puts `nan` (\"not a number\") in that position. I also did not show `C ** D`, but this is for more technical reasons. The problem is that we only used whole numbers when constructing our matrices, and so NumPy assumed that we wanted elements of type `int` (more technically, type `int64`, but don't worry about the difference now). Unfortunately, NumPy does not allow raising integer arrays to negative integer powers. This is another example of how the type of a variable determines which operations you can perform on it. There are a few ways to fix this. The simplest is to construct the matrix using decimals in the first place so that NumPy knows the elements are supposed to be floats. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "9q-fZEI8R1Lo",
"outputId": "1a534ee1-829a-4a4d-a8cf-ee1056d11006"
},
"source": [
"C_float = np.array([[1.0, 0, 2], [3, -1, 1], [2, 2, 0]])\n",
"print(C_float ** D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1. 0. 0.5]\n",
" [27. 1. 1. ]\n",
" [ 0.5 2. 1. ]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-nRTrdYZR1Lp"
},
"source": [
"Notice that I replaced the first entry \"1\" with \"1.0\". If any of the entries has a decimal point, then NumPy makes every entry a float (more technically, a `float64`, which many other languages call a double, but don't worry about the difference now).\n",
"\n",
"A better way is to use one of the built-in methods of ndarrays, `astype`, which returns a copy of the array with the element type you ask for. You can use the code "
]
},
{
"cell_type": "code",
"metadata": {
"id": "OsNtnrGUR1Lq",
"outputId": "73255216-6b5d-4a8a-d0d0-75fa2c3d668b"
},
"source": [
"C_float = C.astype(\"float64\")\n",
"print(C_float ** D)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1. 0. 0.5]\n",
" [27. 1. 1. ]\n",
" [ 0.5 2. 1. ]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7v1A4aj8R1Lr"
},
"source": [
"This won't usually be a problem for us, because most of the arrays we create won't just have whole numbers in them, and so they will automatically use floats already. However, it is a good thing to keep in mind. If you aren't sure what type the elements of your array are, you can use the `dtype` property to check. For instance, "
]
},
{
"cell_type": "code",
"metadata": {
"id": "PQ72EGIOR1Lr",
"outputId": "ca7ebf4f-d6cb-4d60-8108-26bae0184292"
},
"source": [
"print(\"The elements of A have type: \")\n",
"print(A.dtype)\n",
"print(\"The elements of C_float have type:\")\n",
"print(C_float.dtype)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"The elements of A have type: \n",
"int64\n",
"The elements of C_float have type:\n",
"float64\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XTburqY-R1Lr"
},
"source": [
"**End of side note**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mUxXbu1bR1Lr"
},
"source": [
"Just like with 1D arrays, the shapes of your arrays must match. It doesn't make sense to perform elementwise operations with two matrices of different shapes. For example, we cannot add $A$ and $B$, because $A$ is $4\\times 3$ and $B$ is $3\\times 4$, so there is no way to match up elements of $A$ with corresponding elements of $B$. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "xYVC8zxkR1Ls",
"outputId": "a70e03a6-35dc-4f18-decf-0b8f83166b99"
},
"source": [
"print(A + B)"
],
"execution_count": null,
"outputs": [
{
"output_type": "error",
"ename": "ValueError",
"evalue": "operands could not be broadcast together with shapes (4,3) (3,4) ",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mA\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mB\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (4,3) (3,4) "
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8Gl7-heSR1Lx"
},
"source": [
"The same is true for all of the elementwise operations: addition, subtraction, multiplication, division and exponentiation. "
]
},
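{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would rather avoid the error, one option is to compare the `shape` of the two arrays before combining them. The following is only a small sketch of that idea, using the matrices $A$ and $B$ from earlier in this lecture; the message text is made up for the example.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"A = np.array([[-1, 2, 1], [3, 0, -1], [4, -2, 2], [-2, 1, 3]])\n",
"B = np.array([[1, -2, 0, 3], [2, 1, -4, 1], [-3, 0, 1, 1]])\n",
"\n",
"# Elementwise operations need matching shapes, so check before adding\n",
"shapes_match = (A.shape == B.shape)\n",
"if shapes_match:\n",
"    print(A + B)\n",
"else:\n",
"    print('Cannot add: shapes are', A.shape, 'and', B.shape)\n",
"```\n",
"\n",
"The same check works before any of the other elementwise operations."
]
},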
{
"cell_type": "markdown",
"metadata": {
"id": "v_FANP9iR1Ly"
},
"source": [
"## Accessing and Modifying Matrices\n",
"Accessing and modifying entries of a matrix works much like it did with 1D arrays. Recall that you can specify an entry of a vector by giving an integer index, so `x[2]` means \"the third entry (index 2) of x\". You can use similar notation with 2D arrays, but the output might look slightly unusual at first. For example, if A were a 1D array, then the code `A[2]` would give you the third entry (index 2) of the array. If we try the same thing with a matrix, we get "
]
},
{
"cell_type": "code",
"metadata": {
"id": "1wIdL21VR1Ly",
"outputId": "10756b72-f41d-4d8b-fdb9-f69df8f64427"
},
"source": [
"print(A[2])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[ 4 -2 2]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iK9RDf2yR1Lz"
},
"source": [
"This is the third *row*, not the third *entry*, of A. This works for any valid row index (from 0 up to one less than the number of rows in the matrix): "
]
},
{
"cell_type": "code",
"metadata": {
"id": "__ob5nK6R1L0",
"outputId": "82a57fa8-7796-4e74-c8ba-d3f31f5cc146"
},
"source": [
"print(A[0])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[-1 2 1]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "-ypGgZ0xR1L0",
"outputId": "9a431c83-004a-4e3a-dab0-851c99d62bb1"
},
"source": [
"print(A[1])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[ 3 0 -1]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "E3TcNBRFR1L0",
"outputId": "22c4dcef-5856-4388-8f07-7fee7031fecb"
},
"source": [
"print(A[2])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[ 4 -2 2]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "BJPb3NOLR1L0",
"outputId": "428ac873-2c05-49f2-b38d-84538878e276"
},
"source": [
"print(A[3])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[-2 1 3]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NSO13yXLR1L1"
},
"source": [
"You can also use negative index syntax. (Remember, negative indices count backwards from the end of an array, so if A were a 1D array then `A[-1]` would be the last entry.) Since A is a 2D array, this gives the last row instead. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "lDJiM_uzR1L1",
"outputId": "c6fda907-a869-4d6d-bb1f-7d728fee1902"
},
"source": [
"print(A[-1])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[-2 1 3]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lSR2kY0WR1L1"
},
"source": [
"If you want to access an individual element of a matrix instead of an entire row, you have to give two indices, separated by a comma, inside the brackets. For example, the second entry (index 1) in the third row (index 2) can be found with the code "
]
},
{
"cell_type": "code",
"metadata": {
"id": "j_x4ZUo9R1L1",
"outputId": "440a86d2-4e25-43c2-d285-be0373d2dfef"
},
"source": [
"print(A[2, 1])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"-2\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KP3T-M9LR1L2"
},
"source": [
"The general syntax is `name_of_array[row_index, column_index]`. It is important to remember the order: row index and then column index. This mirrors the mathematical syntax. In mathematics, the entry at row $i$, column $j$ of the matrix $A$ is usually denoted by $a_{ij}$. The second entry in the third row (i.e., row 3, column 2) would therefore be written $a_{32}$. Notice that python starts counting indices at 0 and not 1, so the indices in the python code are one smaller than the row/column numbers in mathematical notation. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VCb_U8cDR1L2"
},
"source": [
"You can also use slice syntax to access more than one element at a time. For instance, "
]
},
{
"cell_type": "code",
"metadata": {
"id": "OE1HwMy5R1L2",
"outputId": "43ee5e9c-fce8-4e0d-c0d5-de56800c9bdc"
},
"source": [
"print(A[1:3, 0:3])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 3 0 -1]\n",
" [ 4 -2 2]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gTcjSa7NR1L2"
},
"source": [
"pulls out the second and third rows (because `1:3` means index 1 and 2, but **not** index 3) and the first through third columns (because `0:3` means index 0, index 1 and index 2, but **not** index 3). \n",
"\n",
"The shortcuts for slicing that we talked about in the last lecture apply here as well. For example, if you skip the `start` index of a slice, then python starts from the beginning of the array, so "
]
},
{
"cell_type": "code",
"metadata": {
"id": "SFHGS9QfR1L2",
"outputId": "3f74c3a6-6c1a-4d93-c07b-7c1a5b2f9cbc"
},
"source": [
"print(A[:3, 0:3])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[-1 2 1]\n",
" [ 3 0 -1]\n",
" [ 4 -2 2]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nTiVsUmKR1L3"
},
"source": [
"prints the first through third rows and the first through third columns. Likewise, if you skip the `stop` index of a slice, then python goes all the way to the end of the array, so "
]
},
{
"cell_type": "code",
"metadata": {
"id": "JFl594SAR1L3",
"outputId": "57007387-81f8-419b-f0cf-ec0c4b9db081"
},
"source": [
"print(A[1:3, 1:])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 0 -1]\n",
" [-2 2]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VKrJWdAGR1L3"
},
"source": [
"prints the second through third row (`1:3` means to include index 1 and index 2, but not index 3) and the second through last column (`1:` means to start at index 1 and go to the last index). "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uaoE1iBcR1L3"
},
"source": [
"If you skip both the `start` and the `stop` indices (so you just have a `:`), then python uses *every* index. For instance, "
]
},
{
"cell_type": "code",
"metadata": {
"id": "_Gw0gNU_R1L4",
"outputId": "33dbaf7e-eb42-4c13-ef18-b57f356f1b68"
},
"source": [
"print(A[:, 0:2])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[-1 2]\n",
" [ 3 0]\n",
" [ 4 -2]\n",
" [-2 1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_jegDYZfR1L5"
},
"source": [
"prints *every* row of the first and second columns (because `:` means every index and `0:2` means index 0 and index 1, but not index 2). "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cAPliEC1R1L6"
},
"source": [
"It is also possible to mix slices with single indices. For example, if you wanted the entire second row of `A`, you could use the code"
]
},
{
"cell_type": "code",
"metadata": {
"id": "m-MUt4JfR1L7",
"outputId": "2869732a-b20a-43e9-e5cc-0edbaff4a35f"
},
"source": [
"print(A[1, :])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[ 3 0 -1]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZzhbPasPR1L7"
},
"source": [
"Likewise, if you wanted the entire third column of `A`, you could use"
]
},
{
"cell_type": "code",
"metadata": {
"id": "5oTvGYYzR1L7",
"outputId": "b7455235-aaa3-4d58-b9dc-d506d7afad31"
},
"source": [
"print(A[:, 2])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[ 1 -1 2 3]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0VUSrBVBR1L8"
},
"source": [
"Unfortunately, python throws away some information about the shape of your matrix when you do this. As you can see, the previous two answers are both 1D arrays, not rows or columns. If the shape is important to you (and it almost always will be in this class), then you have to manually reshape these arrays afterwards. For example, "
]
},
{
"cell_type": "code",
"metadata": {
"id": "URSGHr3YR1L8",
"outputId": "997f1d4f-3a4f-4cc2-fcb3-7d8332367d3d"
},
"source": [
"print(np.reshape(A[1, :], (1, -1)))"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 3 0 -1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "w_RKBzmJR1L8",
"outputId": "6a43557d-5c28-4b4d-92ce-bde2b9140422"
},
"source": [
"print(np.reshape(A[:, 2], (-1, 1)))"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1]\n",
" [-1]\n",
" [ 2]\n",
" [ 3]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "M5Rfn8vOR1L8"
},
"source": [
"A useful trick to remember is that you can replace a single number with a slice to avoid this. For example, the slice `2:3` selects the same entries as the index 2 (because `2:3` means to start at index 2 and go up to just before index 3), but because it is a slice, python keeps that dimension in the result. Likewise, the slice `1:2` selects the same entries as the index 1. You could therefore get the second row of `A` or the third column of `A`, with their shapes intact, with the code "
]
},
{
"cell_type": "code",
"metadata": {
"id": "FCvIaOIJR1L9",
"outputId": "ffbaf40f-0676-4dca-f747-eda9939fdedc"
},
"source": [
"print(A[1:2, :])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 3 0 -1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "dpSMOGA6R1L9",
"outputId": "9e037ee5-3d06-4f40-a2b5-cf853a8c4c96"
},
"source": [
"print(A[:, 2:3])"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 1]\n",
" [-1]\n",
" [ 2]\n",
" [ 3]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AHrHK7-sR1L9"
},
"source": [
"This is a little ugly, but saves us from having to reshape arrays over and over again. "
]
},
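{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the difference concretely, the sketch below compares the shapes produced by a single index and by a one-element slice. The variable names `row_1d`, `row_2d` and `col_2d` are made up for this illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"A = np.array([[-1, 2, 1], [3, 0, -1], [4, -2, 2], [-2, 1, 3]])\n",
"\n",
"row_1d = A[1, :]     # single index: the row dimension is dropped\n",
"row_2d = A[1:2, :]   # one-element slice: still a 2D row vector\n",
"col_2d = A[:, 2:3]   # one-element slice: still a 2D column vector\n",
"\n",
"print(row_1d.shape)\n",
"print(row_2d.shape)\n",
"print(col_2d.shape)\n",
"```\n",
"\n",
"The shapes are `(3,)`, `(1, 3)` and `(4, 1)`, so only the sliced versions are still matrices."
]
},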
{
"cell_type": "markdown",
"metadata": {
"id": "BO27BjViR1L9"
},
"source": [
"## Matrix Multiplication\n",
"In mathematics (particularly linear algebra), we very rarely use elementwise multiplication for matrices. Instead, another definition of multiplication is much more common. (This new version is so common that it doesn't have a special name; it is just \"matrix multiplication\".) \n",
"\n",
"We will not really need matrix multiplication until week 3 in this class, but it will become very important later on. We are introducing it right now because it is essentially impossible to avoid in MATLAB, and so half of the class will need to know these definitions right away. If you are focusing on python, you should still read this section."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "38SMl8xNR1L9"
},
"source": [
"### Dot Product\n",
"The definition of matrix multiplication is fairly messy. Before we jump into the full definition, let's take a moment to review the concept of the *dot product*. The dot product of two vectors of the same length is the sum of the products of their corresponding entries. For example, if \n",
"\n",
"$$\\mathbf{x} = \\begin{pmatrix} 2 \\\\ -3 \\\\ 1 \\\\ 4 \\end{pmatrix} \\textrm{ and } \\mathbf{y} = \\begin{pmatrix} -1 \\\\ -2 \\\\ 1 \\\\ 3 \\end{pmatrix}$$\n",
"\n",
"then the dot product of $\\mathbf{x}$ and $\\mathbf{y}$ (which we denote $\\mathbf{x}\\cdot\\mathbf{y}$) is \n",
"\n",
"$$\\mathbf{x}\\cdot\\mathbf{y} = (2)(-1) + (-3)(-2) + (1)(1) + (4)(3) = 17.$$\n",
"\n",
"We can calculate this in python with "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OrzeyrjFR1L9",
"outputId": "80fa6e20-98a2-4145-b314-cccb21541dd0"
},
"source": [
"x = np.array([[2], [-3], [1], [4]])\n",
"y = np.array([[-1], [-2], [1], [3]])\n",
"x_dot_y = x[0, 0] * y[0, 0] + x[1, 0] * y[1, 0] + x[2, 0] * y[2, 0] + x[3, 0] * y[3, 0]\n",
"print(\"The dot product of x and y is: \")\n",
"print(x_dot_y)\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"The dot product of x and y is: \n",
"17\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JsivH_AxR1L-"
},
"source": [
"In general, if $\\mathbf{x}$ and $\\mathbf{y}$ are vectors of the same length $n$, then we define \n",
"\n",
"$$\\mathbf{x}\\cdot\\mathbf{y} = x_1y_1 + x_2y_2 + \\cdots + x_ny_n.$$\n",
"\n",
"Note that this definition only makes sense when $\\mathbf{x}$ and $\\mathbf{y}$ are the same length. "
]
},
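{
"cell_type": "markdown",
"metadata": {},
"source": [
"Writing out every term by hand gets tedious for long vectors. One possible shortcut, sketched here with the same $\\mathbf{x}$ and $\\mathbf{y}$ as above, is to multiply the vectors elementwise and then add up the results with `np.sum`. This is just one way to compute a dot product, chosen because it follows the formula directly.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"x = np.array([[2], [-3], [1], [4]])\n",
"y = np.array([[-1], [-2], [1], [3]])\n",
"\n",
"# x * y multiplies corresponding entries; np.sum adds up the products\n",
"x_dot_y = np.sum(x * y)\n",
"print(x_dot_y)\n",
"```\n",
"\n",
"This gives 17, the same value as the term-by-term calculation."
]
},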
{
"cell_type": "markdown",
"metadata": {
"id": "x_y12HMjR1L-"
},
"source": [
"### Multiplication\n",
"We will define matrix multiplication as follows: If $A$ is an $m\\times n$ matrix and $B$ is an $n\\times k$ matrix, then the product $AB$ is an $m\\times k$ matrix where the entry in the $i$th row and $j$th column is the dot product of the $i$th row of $A$ with the $j$th column of $B$. \n",
"\n",
"It is important to notice that this definition only works when the number of columns of $A$ is the same as the number of rows of $B$, because that way we will only take dot products of vectors of the same length. \n",
"\n",
"This definition may seem overly complicated, but it turns out to be very useful. We will go into much more detail about why when we start solving linear systems. For now, you just need to know how to perform this multiplication in python. As an example, let's multiply the following two matrices: "
]
},
{
"cell_type": "code",
"metadata": {
"id": "Pwo1I9b9R1L-"
},
"source": [
"A = np.array([[-1, 2, 1], [3, 0, 1], [4, -2, 2], [-2, 1, 3]])\n",
"B = np.array([[1, -2], [2, 1], [-3, 0]])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "U0hS9RJzR1L_"
},
"source": [
"$A$ and $B$ are $4\\times 3$ and $3\\times 2$ matrices, respectively. Since the number of columns of $A$ is the same as the number of rows of $B$, we are allowed to multiply $A$ times $B$. In python, we can do this with the `@` operator: "
]
},
{
"cell_type": "code",
"metadata": {
"id": "UYYZA-NoR1MA",
"outputId": "7a956d57-301a-4cbe-886c-d92978a903a2"
},
"source": [
"print(A @ B)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"[[ 0 4]\n",
" [ 0 -6]\n",
" [ -6 -10]\n",
" [ -9 5]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QwOgL7UYR1MA"
},
"source": [
"As expected from our definition, the result is a $4\\times 2$ matrix (the same number of rows as $A$ and the same number of columns as $B$). \n",
"\n",
"However, since the number of columns of $B$ is not the same as the number of rows of $A$, we cannot multiply $B$ times $A$. If we try it, python will give the following error: "
]
},
{
"cell_type": "code",
"metadata": {
"id": "_Yt7kvWkR1MB",
"outputId": "4e042af5-c301-4bee-cdbf-7807859b6b50"
},
"source": [
"print(B @ A)"
],
"execution_count": null,
"outputs": [
{
"output_type": "error",
"ename": "ValueError",
"evalue": "matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 2)",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mB\u001b[0m \u001b[0;34m@\u001b[0m \u001b[0mA\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mValueError\u001b[0m: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 4 is different from 2)"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kDMGQYmBR1MC"
},
"source": [
"When you see the error \"mismatch in its core dimension\", that tells you that you are trying to multiply matrices when the numbers of rows and columns don't match. "
]
},
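{
"cell_type": "markdown",
"metadata": {},
"source": [
"To connect the `@` operator back to the definition, the sketch below rebuilds the product one entry at a time, taking the dot product of each row of $A$ with each column of $B$, and then compares the result with `A @ B`. It uses `for` loops and the hypothetical names `m`, `n`, `k` and `product`; it is meant only as an illustration of the definition, not as the way you should multiply matrices in practice.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"A = np.array([[-1, 2, 1], [3, 0, 1], [4, -2, 2], [-2, 1, 3]])\n",
"B = np.array([[1, -2], [2, 1], [-3, 0]])\n",
"\n",
"m, n = A.shape    # A is m x n\n",
"k = B.shape[1]    # B is n x k\n",
"product = np.zeros((m, k), dtype=A.dtype)\n",
"\n",
"for i in range(m):\n",
"    for j in range(k):\n",
"        # Entry (i, j) is the dot product of row i of A with column j of B\n",
"        product[i, j] = np.sum(A[i, :] * B[:, j])\n",
"\n",
"print(product)\n",
"print(np.array_equal(product, A @ B))\n",
"```\n",
"\n",
"The final line prints `True`, confirming that the entry-by-entry construction and the `@` operator agree."
]
},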
{
"cell_type": "markdown",
"metadata": {
"id": "dBlV511tR1MC"
},
"source": [
"In particular, notice that we can't multiply a column vector times a column vector, because it doesn't make sense to multiply an $n\\times 1$ by an $m\\times 1$ (the 1 and the $m$ don't match). We also can't multiply a row vector times a row vector, because it doesn't make sense to multiply a $1\\times n$ by a $1\\times m$ (the $n$ and the 1 don't match). "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Byt0ny72R1MC"
},
"source": [
"From now on, whenever we talk about matrix multiplication in this class (including on all of the homework assignments), we will mean this definition. That is, if you see an expression like $AB$ or $A\\mathbf{x}$, you should use the `@` operator in python. "
]
}
]
}