Mindect

Vectorization Part 1

Introduction

In this video, you will learn about a very useful concept called vectorization. When implementing learning algorithms, vectorization makes your code both shorter and more efficient. Writing vectorized code also lets you take advantage of modern numerical linear algebra libraries, and even GPU (graphics processing unit) hardware, which can help your code run much faster.

Let’s dive into a concrete example using parameters w and b, where w is a vector with three numbers, and you also have a vector of features x with three numbers.

V1

In this example, n = 3. Notice that in linear algebra, indexing starts from 1, so the first value is subscripted w_1 and x_1.

Python Code Implementation

In Python, you can define these variables w, b, and x using arrays like this:

V2

We’re using the NumPy library, the most widely used numerical linear algebra library in Python and machine learning. Note that Python indexing starts from 0, so w[0] accesses the first value of w, and similarly for x[0].
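As a minimal sketch of what these definitions might look like, the snippet below builds w, b, and x as NumPy arrays. The numeric values are illustrative assumptions chosen for this example, not values taken from the original figure.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
w = np.array([1.0, 2.5, -3.3])    # parameters w_1, w_2, w_3
b = 4.0                           # bias term
x = np.array([10.0, 20.0, 30.0])  # features x_1, x_2, x_3

n = w.shape[0]   # number of features, here 3

# Python indexing starts at 0, so index 0 holds the math-notation w_1 and x_1.
first_w = w[0]
first_x = x[0]
print(n, first_w, first_x)
```

Note that the last valid index is n - 1, so w[2] holds what the math notation calls w_3.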

Non-Vectorized Code Implementation

Here’s an implementation without vectorization for computing the model's prediction:

V3

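You multiply each parameter w[j] by its associated feature x[j], add the products together, and then add b. Writing out every term by hand works for n = 3, but it does not scale: when n is large, this code is tedious to write and slow to run.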
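A hand-written version for n = 3 could look like the sketch below. The values of w, b, and x are assumptions made up for illustration.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
w = np.array([1.0, 2.5, -3.3])
b = 4.0
x = np.array([10.0, 20.0, 30.0])

# Every term is written out explicitly, one product per feature.
f = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + b
print(f)  # approximately -35
```

With 3 features this is manageable, but the expression would need one more term for every additional feature.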

A more compact implementation, still without vectorization, uses a for loop. In math, this operation is written as a summation:

$$f(x) = \sum_{j=1}^{n} w_j x_j + b$$

The code implementation would look like this:

V4

Notice that in Python, range(0, n) means that j goes from 0 to n-1, which matches the math indices 1 through n shifted down by one.
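The summation can be sketched as a loop that accumulates one product per iteration and adds b at the end. As before, the values of w, b, and x are illustrative assumptions.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
w = np.array([1.0, 2.5, -3.3])
b = 4.0
x = np.array([10.0, 20.0, 30.0])

n = w.shape[0]

f = 0.0
for j in range(0, n):      # j takes the values 0, 1, ..., n-1
    f = f + w[j] * x[j]    # add the j-th product to the running sum
f = f + b                  # add the bias once, after the loop
print(f)  # approximately -35
```

This version works for any n without changing the code, but it still processes one element at a time.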

Vectorized Code Implementation

With vectorization, you can compute the same function with a single line of code:

$$f(x) = w^T x + b$$

This can be implemented in Python as:

fp = np.dot(w, x) + b

V5

This vectorized implementation is not only shorter but also faster, especially for large n. Behind the scenes, NumPy's dot function runs optimized compiled code that can take advantage of parallel hardware on the CPU. NumPy itself does not run on the GPU, although GPU-backed array libraries apply the same idea.
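As a small sketch, the vectorized form can be checked against the explicit sum of products. The values of w, b, and x are again illustrative assumptions.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
w = np.array([1.0, 2.5, -3.3])
b = 4.0
x = np.array([10.0, 20.0, 30.0])

# Vectorized: one call computes the whole dot product.
fp = np.dot(w, x) + b

# Reference: the same quantity written out term by term.
f_explicit = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + b

print(fp, f_explicit)
print(np.isclose(fp, f_explicit))
```

The comparison uses np.isclose rather than == because floating-point sums can differ in the last few digits depending on the order of operations.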

Benefits of Vectorization

  1. Shorter code – Easier to write and read.
  2. Faster execution – The parallel nature of vectorized operations makes them much more efficient.

By using vectorization, your code will be more practical and efficient, especially when n is large.
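One way to see the speed difference on your own machine is to time both versions on a larger input. The sketch below is a rough comparison, not a rigorous benchmark. The function names predict_loop and predict_vectorized, the array size, and the random data are all assumptions made for this example, and the measured times will vary by machine.

```python
import time
import numpy as np

def predict_loop(w, x, b):
    """Non-vectorized: accumulate one product per loop iteration."""
    f = 0.0
    for j in range(w.shape[0]):
        f += w[j] * x[j]
    return f + b

def predict_vectorized(w, x, b):
    """Vectorized: a single call to np.dot."""
    return np.dot(w, x) + b

# Hypothetical test data: n random parameters and features.
rng = np.random.default_rng(0)
n = 200_000
w = rng.random(n)
x = rng.random(n)
b = 4.0

start = time.perf_counter()
f_loop = predict_loop(w, x, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
f_vec = predict_vectorized(w, x, b)
vec_seconds = time.perf_counter() - start

print(f"loop:       {loop_seconds:.6f} s")
print(f"vectorized: {vec_seconds:.6f} s")
print("results agree:", np.isclose(f_loop, f_vec))
```

Both functions compute the same prediction, so any difference in the timings comes from how the work is carried out rather than from what is computed.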


Conclusion

To summarize, vectorization makes your code more concise and faster. The magic behind vectorization comes from modern hardware that can process many values in parallel. In the next part, we’ll explore what happens behind the scenes to make vectorized code run faster.
