A6: NumPy (Part-2): Indexing, slicing, broadcasting, fancy indexing, boolean masking & universal functions (ufuncs).

Junaid Qazi, PhD
8 min read · Jan 30, 2020

This article is a part of “Data Science from Scratch — Can I to I Can” series.

Click here for the previous article/lecture on “A5: NumPy (Part-1): Arrays, random module, array methods & attributes.”

Hi Guys,
In the previous article/lecture, we learned about NumPy arrays along with other basic concepts in NumPy. Let’s move on and talk about indexing, slicing, broadcasting, fancy indexing, and boolean masking. We will also talk about arithmetic operations on NumPy arrays, along with universal functions (ufuncs), at the end of this article.

In NumPy arrays, indexes are zero-based (the first element in a row is referenced using “0”), and we can select a slice of an array using indexes as well. Let’s learn by doing!

Indexing & slicing 1-D arrays (vectors):

Let’s start with indexing & slicing of 1-D arrays (vectors) with a simple example.

In the simplest case, selecting one or more elements of the NumPy array "array_1d" looks very similar to Python lists.

Similar to Python lists, we can grab a value from a NumPy array using a negative index (starting at -1) as well.
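Here is a minimal sketch of both ideas. The contents of array_1d are an assumption (the integers from -10 to 10), chosen only for illustration.

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

first = array_1d[0]         # first element (index 0)
third = array_1d[2]         # third element (index 2)
last = array_1d[-1]         # last element (index -1)
second_last = array_1d[-2]  # second to last element (index -2)

print("First element:", first, "| Third element:", third)
print("Last element:", last, "| Second to last:", second_last)
```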

Let’s grab a range of elements from "array_1d". (Using a print statement with some descriptive text is very useful and considered good programming practice.)

We can use negative indices to grab a range of values/items from our array "array_1d" as well!

Let’s grab everything up to index 2 with "[:2]", and everything from index 2 onward with "[2:]". The start and stop indices are optional in these cases.
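The slices described above could look like the sketch below. Again, the contents of array_1d (integers from -10 to 10) are an assumption, and the variable names are only illustrative.

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

middle = array_1d[2:5]      # indexes 2, 3 and 4 (5 is not included)
last_three = array_1d[-3:]  # negative start: the last three elements
up_to_2 = array_1d[:2]      # start omitted: everything before index 2
from_2 = array_1d[2:]       # stop omitted: everything from index 2 onward

print("Elements at index 2 to 4:", middle)
print("Last three elements:", last_three)
print("Everything up to index 2:", up_to_2)
print("Everything from index 2:", from_2)
```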

What if we need to assign a new value to a certain index in the array? It’s easy!
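A short sketch of such an assignment, assuming array_1d starts out as the integers from -10 to 10:

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

# Assign a new value to the element at index 0
array_1d[0] = -102
print("array_1d after the assignment:", array_1d)
```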

Notice, the first element is changed to -102. It was -10 in the original "array_1d" (see above).

If the index does not exist, we get an "IndexError". Let’s try a bigger number, e.g. 305, as the index (we know 305 is not a valid index for "array_1d").

To avoid such errors, it’s always a good idea to check the size of the array using array_1d.size.
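The sketch below triggers the error on purpose and then checks the index against the size first. The array contents and the helper names (caught_error, is_valid) are assumptions made for this illustration.

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)
print("Number of elements:", array_1d.size)

index = 305
caught_error = False
try:
    array_1d[index]  # 305 is far beyond the last valid index
except IndexError as err:
    caught_error = True
    print("IndexError:", err)

# Valid positions run from -size to size - 1, so we can check before indexing
is_valid = -array_1d.size <= index < array_1d.size
print("Is 305 a valid index?", is_valid)
```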

Indexing & slicing 2-D arrays (matrices):

Let’s create an array with 24 elements using arange() and convert it to a 2D matrix using "shape". (Note, 6 * 4 = 24.)
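One way this could be written is shown below; assigning a tuple to the shape attribute reshapes the array in place.

```python
import numpy as np

array_2d = np.arange(24)  # 24 elements: 0, 1, ..., 23
array_2d.shape = (6, 4)   # 6 rows and 4 columns (6 * 4 = 24)

print("Shape:", array_2d.shape)
print(array_2d)
```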

Now, from the above matrix array_2d, we can grab a complete row by passing a row number (starting from 0).

We have 6 rows in array_2d. Let's grab the 3rd row using a negative index; we need to pass in -4 in this case. (Remember, -0 is the same as 0, try it.)
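A quick sketch, assuming array_2d is the 6 x 4 matrix built from arange(24):

```python
import numpy as np

# Assumption: array_2d is the 6 x 4 matrix holding 0 through 23
array_2d = np.arange(24).reshape(6, 4)

third_row = array_2d[2]       # 3rd row, counting from index 0
third_row_neg = array_2d[-4]  # the same row, counted from the end

print("Row at index 2: ", third_row)
print("Row at index -4:", third_row_neg)
```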

A single element can be easily accessed from a 2D array, and the general format is: array_2d[row][col] or array_2d[row,col].

Generally, we use [row,col]; the comma ',' makes it clearer. So, let’s get an individual element/value from row = 5 and column = 2.
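Both formats return the same value, as this small sketch shows (array_2d is assumed to be the 6 x 4 matrix built from arange(24)):

```python
import numpy as np

# Assumption: array_2d is the 6 x 4 matrix holding 0 through 23
array_2d = np.arange(24).reshape(6, 4)

value_comma = array_2d[5, 2]     # [row, col] format
value_brackets = array_2d[5][2]  # [row][col] format

print("array_2d[5, 2] =", value_comma)
print("array_2d[5][2] =", value_brackets)
```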

Want to get a slice of a 2D array? It’s easy!

array_2d[0:2,0:2] =>array_2d[from_row_0:till_row_2(2 not included), from_col_0:till_col_2(2 not included)]
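Applied to our matrix, the slice above could look like this (array_2d is assumed to be the 6 x 4 matrix built from arange(24)):

```python
import numpy as np

# Assumption: array_2d is the 6 x 4 matrix holding 0 through 23
array_2d = np.arange(24).reshape(6, 4)

# Rows 0 and 1, columns 0 and 1 (the stop index 2 is not included)
top_left = array_2d[0:2, 0:2]
print(top_left)
```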

Broadcasting:

NumPy arrays are different from normal Python lists because of their ability to broadcast. We will only cover the basics; for further details, see the broadcasting rules.

Let’s start with some simple examples:

Now, take a slice of the array array_1d and set it equal to some number, say 500. Setting a slice to 500 will actually broadcast it to the selected elements of the array_1d.
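A minimal sketch of this, assuming array_1d holds the integers from -10 to 10 and that we pick the first five elements as our slice:

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

# The single value 500 is broadcast to every element in the slice
array_1d[0:5] = 500
print(array_1d)
```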

Let’s try broadcasting with a 2D array. We can use np.ones() to create a 4x4 matrix.

We can broadcast an array onto another array, or a value onto an array, using a mathematical operator, e.g. +.
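The sketch below uses np.ones() for the 4x4 matrix and the + operator for broadcasting. The scalar 5 and the row np.arange(4) are arbitrary values picked for illustration.

```python
import numpy as np

ones_2d = np.ones((4, 4))  # 4 x 4 matrix filled with 1.0

# Broadcasting a single value: 5 is added to every element
plus_value = ones_2d + 5

# Broadcasting a 1D array: [0, 1, 2, 3] is added to every row
plus_row = ones_2d + np.arange(4)

print(plus_value)
print(plus_row)
```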

Here is another broadcasting example, read the comments please!

If you still have some confusion on broadcasting, let’s try to understand visually and then hands-on code!

Now, our array_1 is “1 row & 3 columns” and our array_2 is “1 row & 1 column”. Let’s broadcast array_2 to array_1. (pictures source broadcasting)

Now, let’s consider broadcasting an array of “3 rows & 1 column” to an array of “3 rows & 3 columns”.

Another one, a slightly more complex case: consider broadcasting an array of “1 row & 3 columns” to an array of “3 rows & 1 column”.
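The three cases above can be tried hands-on with the sketch below. The values inside the arrays are assumptions; only the shapes matter here.

```python
import numpy as np

# Case 1: (1 row, 3 columns) + (1 row, 1 column) -> (1, 3)
array_1 = np.array([[0, 1, 2]])
array_2 = np.array([[5]])
case_1 = array_1 + array_2

# Case 2: (3 rows, 3 columns) + (3 rows, 1 column) -> (3, 3)
matrix = np.ones((3, 3))
column = np.array([[0], [1], [2]])
case_2 = matrix + column  # the column is stretched across all 3 columns

# Case 3: (3 rows, 1 column) + (1 row, 3 columns) -> (3, 3)
row = np.array([[0, 1, 2]])
case_3 = column + row  # both arrays are stretched to 3 x 3

print(case_1, case_2, case_3, sep="\n\n")
```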

I hope the concept of broadcasting is clear now!

Fancy Indexing (Good to know):

Fancy indexing allows us to select entire rows or columns out of order.
Let’s create a NumPy array_2d to see how it works! Do you remember zeros(), range(), shape and broadcasting? Let’s revise these concepts :)

* array_2d = np.zeros((5,5))
* array_2d[1]=1 # broadcasting 1 to the 2nd row at index 1
* array_2d[2]=2 # broadcasting 2 to the 3rd row at index 2
* array_2d[3]=3 # broadcasting 3 to the 4th row at index 3
* array_2d[4]=4 # broadcasting 4 to the 5th row at index 4
* array_2d # see what the matrix looks like!

The above process is tedious! Think you can use a for loop? OK, let’s try a for loop for the tasks given above! (The comments are provided for revision!)
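One possible version of that loop is sketched below. It produces the same matrix as the step-by-step assignments (row 0 simply stays at 0).

```python
import numpy as np

array_2d = np.zeros((5, 5))     # 5 x 5 matrix of zeros
num_rows = array_2d.shape[0]    # shape is (rows, columns)

for i in range(num_rows):       # i takes the values 0, 1, 2, 3, 4
    array_2d[i] = i             # broadcast the value i to the whole row i

print(array_2d)
```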

Fancy indexing allows us to grab any rows using their indexes; let’s grab rows 1, 2 and 3. We need to pass in a list of the required rows in square brackets!
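A small sketch, assuming array_2d is the 5 x 5 matrix in which every row is filled with its own index:

```python
import numpy as np

# Assumption: array_2d is the 5 x 5 matrix where row i is filled with i
array_2d = np.zeros((5, 5))
for i in range(5):
    array_2d[i] = i

# Note the double brackets: a list of row indexes inside the indexing brackets
rows_1_2_3 = array_2d[[1, 2, 3]]
print(rows_1_2_3)
```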

Here is another example of fancy indexing: (Read the comments please)

We can grab any arbitrary rows, try 2 and 4!

We can grab the columns as well!
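Here is a sketch of both. For the columns, a second array named grid (a hypothetical name, filled with 0 to 24) is used so that the selected columns are easy to tell apart.

```python
import numpy as np

# Assumption: array_2d is the 5 x 5 matrix where row i is filled with i
array_2d = np.zeros((5, 5))
for i in range(5):
    array_2d[i] = i

rows_2_4 = array_2d[[2, 4]]  # rows 2 and 4
rows_4_2 = array_2d[[4, 2]]  # the same rows, out of order

# Hypothetical second array with distinct values in every column
grid = np.arange(25).reshape(5, 5)
cols_3_1 = grid[:, [3, 1]]   # all rows, columns 3 and 1 (in that order)

print(rows_2_4, rows_4_2, cols_3_1, sep="\n\n")
```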

Boolean mask arrays:

A boolean mask is very useful and handy when it comes to counting, modifying, extracting or manipulating values in an array based on a certain condition or criterion.

For example:

  • We want to count all the values greater than a certain value.
  • We set a threshold and want to get rid of outliers in our data.

In NumPy, Boolean masking is often the most efficient way to accomplish these types of tasks.

Let’s start with a simple example.

We can apply conditions such as >, <, == etc.
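Each condition returns an array of True/False values with the same shape as the input. A minimal sketch, assuming array_1d holds the integers from -10 to 10 and using 3 as an arbitrary threshold:

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

greater_than_3 = array_1d > 3  # True where the value is above 3
negative = array_1d < 0        # True where the value is below 0
equal_to_0 = array_1d == 0     # True only where the value is 0

print("Mask for > 3:", greater_than_3)
# True counts as 1, so counting the True entries counts the matching values
print("Values greater than 3:", np.count_nonzero(greater_than_3))
print("Negative values:", np.count_nonzero(negative))
```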

We can create a mask to select the even numbers in our “array_1d”.

Now, mod_2_mask_id is our boolean mask. In the masking operation, we simply index "array_1d" with this boolean array; that returns a 1D array filled with all the values that meet the condition, i.e. all the values at positions where the mask array (mod_2_mask_id) is "True".
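A sketch of this masking operation is given below. The contents of array_1d and the exact condition behind mod_2_mask_id (remainder 0 after dividing by 2) are assumptions.

```python
import numpy as np

# Assumption: array_1d holds the integers -10 through 10
array_1d = np.arange(-10, 11)

# True where the remainder after dividing by 2 is zero (even numbers)
mod_2_mask_id = array_1d % 2 == 0

# Indexing with the mask keeps only the positions where it is True
evens = array_1d[mod_2_mask_id]
print("Mask:", mod_2_mask_id)
print("Even values:", evens)
```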

Similarly, in the above example, we have a 2D array array_2d and mask mask_mod_2_2d. Based on this mask, we can filter out the odds from our array_2d, let's do this!
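The same idea on a 2D array could be sketched as follows. Here array_2d is assumed to be the 6 x 4 matrix built from arange(24), and mask_mod_2_2d is assumed to mark the even values; the ~ operator flips the mask to pick the odds instead. Note that the result of boolean masking is always a 1D array.

```python
import numpy as np

# Assumption: array_2d is the 6 x 4 matrix holding 0 through 23
array_2d = np.arange(24).reshape(6, 4)

# Assumption: the mask is True for even values
mask_mod_2_2d = array_2d % 2 == 0

evens_2d = array_2d[mask_mod_2_2d]   # values where the mask is True
odds_2d = array_2d[~mask_mod_2_2d]   # ~ inverts the mask: values where it is False

print("Evens:", evens_2d)
print("Odds: ", odds_2d)
```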

Arithmetic operations on arrays:

We can perform arithmetic operations with NumPy arrays.
Let’s learn with examples: (read the comments please)

Notice the warning in the division; comments are provided!
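Here is a sketch of element-wise arithmetic, using an assumed array holding 0 to 4. Because the first element is 0, the divisions raise warnings rather than errors; the sketch records them with Python's warnings module so they can be printed.

```python
import warnings
import numpy as np

# Assumption: a small array that contains a zero
array = np.arange(0, 5)  # [0 1 2 3 4]

added = array + array       # element-wise addition
subtracted = array - array  # element-wise subtraction
multiplied = array * array  # element-wise multiplication

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    divided = array / array  # 0/0 gives nan, with a warning (not an error)
    inverse = 1 / array      # 1/0 gives inf, with a warning (not an error)

print("Sum:", added, "| Product:", multiplied)
print("array / array:", divided)
print("1 / array:", inverse)
for w in caught:
    print("Warning:", w.message)
```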

Let’s square the array and multiply it by some number.
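A short sketch, with an assumed array holding 0 to 4 and 10 as the arbitrary multiplier:

```python
import numpy as np

# Assumption: a small array holding 0 through 4
array = np.arange(0, 5)

squared = array ** 2   # every element is squared
scaled = squared * 10  # 10 is broadcast to every element

print("Squared:", squared)
print("Squared and multiplied by 10:", scaled)
```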

Universal functions (ufuncs):

Time to talk about universal functions. NumPy has a range of built-in universal functions (ufuncs). These are essentially just mathematical operations, and we can use them to perform the specific task associated with each function across the whole NumPy array, element by element.

Let’s learn with examples:
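A few ufuncs in action are sketched below. The input values are assumptions picked so that the results are easy to check by hand.

```python
import numpy as np

# Assumption: a small array of perfect squares
array = np.array([0, 1, 4, 9, 16])

roots = np.sqrt(array)          # square root of every element
exponentials = np.exp(array)    # e raised to every element
at_least_5 = np.maximum(array, 5)  # element-wise maximum of each value and 5

print("Square roots:", roots)
print("Exponentials:", exponentials)
print("Element-wise maximum with 5:", at_least_5)
print("Is np.sqrt a ufunc?", isinstance(np.sqrt, np.ufunc))
```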

NumPy even has built-in functions to convert degrees to radians and vice versa!
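For instance, np.deg2rad() and np.rad2deg() do these conversions. The angles below are just sample values.

```python
import numpy as np

# Sample angles in degrees (chosen for illustration)
degrees = np.array([0, 90, 180, 360])

radians = np.deg2rad(degrees)        # degrees -> radians
back_to_degrees = np.rad2deg(radians)  # radians -> degrees

print("Radians:", radians)
print("Back to degrees:", back_to_degrees)
```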

We are done with the NumPy essentials! I suggest you do a quick review and move on to the exercises for practice in the next article.

See you in the next lecture on “A7: NumPy (Practice Exercises)”.

Note: This complete course, including video lectures and jupyter notebooks, is available on the following links:

About Dr. Junaid Qazi:

Dr. Junaid Qazi is a Subject Matter Specialist, Data Science & Machine Learning Consultant. He is a Professional Development Coach, Mentor, Author, and Invited Speaker. He can be reached for consulting projects and/or professional development training via LinkedIn or through ScienceAcademy.ca.


We offer professional development, corporate training, consulting, curriculum and content development in Data Science, Machine Learning and Blockchain.