If A is an m×n matrix and B is an n×p matrix, the matrix product C=AB (denoted without multiplication signs or dots) is defined to be the m×p matrix such that

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj},$$

where $a_{ik}$ are the elements of matrix A, $b_{kj}$ are the elements of matrix B, and $i=1,\ldots,m$, $k=1,\ldots,n$, $j=1,\ldots,p$. In other words, $c_{ij}$ is the dot product of the i-th row of A and the j-th column of B.
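To see the definition in action, here is a small sketch that computes each $c_{ij}$ with explicit loops and compares the result against NumPy. The function name `matmul_by_definition` is illustrative, not part of NumPy:

```python
import numpy as np

def matmul_by_definition(A, B):
    # c_ij is the dot product of the i-th row of A and the j-th column of B.
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "number of columns of A must equal number of rows of B"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[4, 9, 9], [9, 1, 6], [9, 2, 3]])
B = np.array([[2, 2], [5, 7], [4, 4]])
print(matmul_by_definition(A, B))  # matches np.matmul(A, B)
```

The triple loop makes the definition transparent, but as discussed below, the vectorized NumPy functions are far more efficient in practice.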
As with the dot product, there are a few ways to perform matrix multiplication in Python. As discussed in the previous lab, the calculations are more efficient in vectorized form. Let's discuss the most commonly used vectorized functions. First, define two matrices:
A = np.array([[4, 9, 9], [9, 1, 6], [9, 2, 3]])
print("Matrix A (3 by 3):\n", A)
B = np.array([[2, 2], [5, 7], [4, 4]])
print("Matrix B (3 by 2):\n", B)
You can multiply matrices A and B using the NumPy function np.matmul():
np.matmul(A, B)
This will output a 3×2 matrix as an np.array. The Python operator @ will also work here, giving the same result:
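For example, with the matrices defined above:

```python
import numpy as np

A = np.array([[4, 9, 9], [9, 1, 6], [9, 2, 3]])
B = np.array([[2, 2], [5, 7], [4, 4]])

# The @ operator performs the same matrix multiplication as np.matmul().
print(A @ B)
```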
Mathematically, matrix multiplication is defined only if the number of columns of matrix A is equal to the number of rows of matrix B (you can check the definition in section 1 again and see that otherwise the dot products between rows and columns are not defined).
Thus, in the example above (2), reversing the order of the matrices to compute BA will not work, as the rule above no longer holds. You can check this by running the cells below: both of them will give errors.
try:
    np.matmul(B, A)
except ValueError as err:
    print(err)
try:
    B @ A
except ValueError as err:
    print(err)
So when using matrix multiplication you will need to be very careful about the dimensions: the number of columns in the first matrix must match the number of rows in the second matrix. This is very important for your future understanding of Neural Networks and how they work.
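A small sketch of a defensive check you might add before multiplying (the helper `can_multiply` is illustrative, not a NumPy function): the inner dimensions, i.e. the columns of the first matrix and the rows of the second, must match.

```python
import numpy as np

A = np.array([[4, 9, 9], [9, 1, 6], [9, 2, 3]])  # shape (3, 3)
B = np.array([[2, 2], [5, 7], [4, 4]])           # shape (3, 2)

def can_multiply(M1, M2):
    # Matrix product M1 @ M2 is defined only when the inner dimensions agree.
    return M1.shape[1] == M2.shape[0]

print(can_multiply(A, B))  # True:  (3, 3) @ (3, 2) is defined
print(can_multiply(B, A))  # False: (3, 2) @ (3, 3) is not
```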
However, for multiplying vectors, NumPy has a shortcut. You can define two vectors x and y of the same size (which can be understood as two 3×1 matrices). If you check the shape of the vector x, you can see that it is a 1-D array of shape (3,), not a 3×1 matrix:
x = np.array([1, -2, -5])
y = np.array([4, 3, -1])
print("Shape of vector x:", x.shape)
print("Number of dimensions of vector x:", x.ndim)
print("Shape of vector x, reshaped to a matrix:", x.reshape((3, 1)).shape)
print("Number of dimensions of vector x, reshaped to a matrix:", x.reshape((3, 1)).ndim)
Following the matrix multiplication convention, the product of two 3×1 matrices is not defined. For matrix multiplication you would expect an error in the following cell, but let's check the output:
x @ y
You can see that there is no error and that the result is actually the dot product x⋅y! So, vector x was automatically transposed into a 1×3 vector and the matrix multiplication xᵀy was calculated. While this is very convenient, you need to keep this behavior in mind and be careful not to use it in the wrong way. The following cell will return an error:
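To make the row-times-column structure explicit, you can reshape x into a 1×3 matrix and y into a 3×1 matrix; the product is then a 1×1 matrix holding the dot product:

```python
import numpy as np

x = np.array([1, -2, -5])
y = np.array([4, 3, -1])

# Explicit x^T y as a (1, 3) @ (3, 1) matrix product, giving a (1, 1) result.
result = x.reshape((1, 3)) @ y.reshape((3, 1))
print(result)        # [[3]]
print(result.shape)  # (1, 1)
```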
try:
    np.matmul(x.reshape((3, 1)), y.reshape((3, 1)))
except ValueError as err:
    print(err)
You might have a question in your mind: does the np.dot() function also work for matrix multiplication? Let's try it:
np.dot(A, B)
Yes, it works! For 2-D arrays, np.dot() performs matrix multiplication, giving the same product matrix as np.matmul(). A related NumPy feature worth knowing about is broadcasting, which stretches arrays of different shapes so that element-wise operations can be performed between them. For example:
A - 2
Mathematically, subtraction of a scalar from the 3×3 matrix A is not defined, but Python broadcasts the scalar, creating a 3×3 np.array and performing the subtraction element by element. A practical example of matrix multiplication can be seen in a linear regression model. You will implement it in this week's assignment!
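Before moving on, here is one more broadcasting sketch: a 1-D array can also be stretched across the rows of a matrix (the vector v here is illustrative):

```python
import numpy as np

A = np.array([[4, 9, 9], [9, 1, 6], [9, 2, 3]])
v = np.array([1, 10, 100])

# The scalar 2 is broadcast to every element of A.
print(A - 2)
# v is broadcast across the rows of A: each column of A is
# multiplied element-wise by the matching entry of v.
print(A * v)
```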