Python Lesson 21: Joining & Splitting NumPy Arrays (NumPy pt 4)

Hello everybody,

Michael here, and today’s lesson will be on joining and splitting NumPy arrays.

Joining NumPy arrays simply involves concatenating two or more NumPy arrays. However, joining two NumPy arrays isn’t as simple as joining two strings.

To join two or more NumPy arrays together, use the np.concatenate function. Here’s a simple example of NumPy array concatenation:

import numpy as np

numpy1 = np.array([0, 0.4, 0.8, 1.2, 1.6, 2])
numpy2 = np.array([2.4, 2.8, 3.2, 3.6, 4, 4.4])
numpy3 = np.concatenate((numpy1, numpy2))
print(numpy3)

[0.  0.4 0.8 1.2 1.6 2.  2.4 2.8 3.2 3.6 4.  4.4]

In order for the np.concatenate function to work properly, you need to pass in all the arrays you’d like to concatenate as a single sequence, such as a tuple; if you pass in the arrays one-by-one as separate arguments, the function won’t work.
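
Here’s a small sketch of that rule in action. The arrays a and b are just shortened stand-ins I made up for this illustration; the point is that a tuple or a list both work, while passing the arrays one-by-one makes NumPy treat the second array as the axis argument.

```python
import numpy as np

# Shortened stand-in arrays (made up for this illustration).
a = np.array([0, 0.4, 0.8])
b = np.array([1.2, 1.6, 2])

# A tuple or a list of arrays both work as the first argument.
from_tuple = np.concatenate((a, b))
from_list = np.concatenate([a, b])
print(from_tuple)

# Passing the arrays one-by-one makes NumPy read `b` as the axis argument.
try:
    np.concatenate(a, b)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)
```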

Also, when you concatenate multiple NumPy arrays, the dimensions stay the same:

print(numpy1.ndim)
print(numpy2.ndim)
print(numpy3.ndim)

1
1
1

Numpy1 and numpy2 are both 1-D arrays; the combined array numpy3 is also a 1-D array.

Now, can we join two arrays with different dimensions together? Let’s take a look:

numpy4 = np.array([[3, 6, 9], [4, 8, 12]])
numpy5 = np.array([5, 10, 15])
numpy6 = np.concatenate((numpy4, numpy5))
print(numpy6)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-479431df35d4> in <module>
      1 numpy4 = np.array([[3, 6, 9], [4, 8, 12]])
      2 numpy5 = np.array([5, 10, 15])
----> 3 numpy6 = np.concatenate((numpy4, numpy5))
      4 print(numpy6)

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)

As you can see, trying to concatenate two arrays with different dimensions doesn’t work; all of the arrays you’re trying to concatenate must have the same number of dimensions.
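
If you do want to join these two arrays, one possible workaround (a sketch, not the only way) is to first turn the 1-D array into a 1×3 2-D array so both inputs have the same number of dimensions. The name numpy5_as_row is something I made up for this example.

```python
import numpy as np

numpy4 = np.array([[3, 6, 9], [4, 8, 12]])  # 2-D, shape (2, 3)
numpy5 = np.array([5, 10, 15])              # 1-D, shape (3,)

# Turn the 1-D array into a 1x3 2-D array so both inputs have 2 dimensions.
numpy5_as_row = numpy5.reshape(1, -1)

numpy6 = np.concatenate((numpy4, numpy5_as_row))
print(numpy6.shape)  # (3, 3)
print(numpy6)
```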

Now, in the array concatenation examples I’ve shown you, the arrays are being joined along the same axis. How could you join arrays along different axes?

You would stack the arrays together. Here’s an example of NumPy array stacking:

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])
numpy9 = np.stack((numpy7, numpy8), axis=1)
print(numpy9)

[[14 15]
 [28 30]
 [42 45]]

Stacking arrays is similar to concatenation, except np.stack joins the arrays along a new axis, so the result has one more dimension than the arrays you started with. To stack NumPy arrays, use the np.stack function and pass a tuple containing the arrays you want to stack and the axis you want to stack them on; if you don’t pass an axis into np.stack, arrays will automatically be stacked along the first axis.

What would it look like if these arrays were stacked along the first axis rather than the second axis? Take a look:

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])
numpy9 = np.stack((numpy7, numpy8))
print(numpy9)

[[14 28 42]
 [15 30 45]]

All of the elements from both arrays are still present; however, stacking along the first axis creates a 2×3 array, while stacking along the second axis creates a 3×2 array. Both stacked arrays are still 2-D.

  • In case you didn’t figure it out, axis=0 refers to the first axis while axis=1 refers to the second axis.
  • You can’t stack along a non-existent axis; in this example, the stacked array only has two dimensions, therefore you can’t stack along axis=2 because the stacked array has no third dimension and thus has no third axis.
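
To make the two notes above concrete, here’s a quick sketch that compares the shapes you get from both axes and confirms that axis=2 is rejected. The variable names rows, cols, and stack_failed are just ones I picked for this example.

```python
import numpy as np

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])

rows = np.stack((numpy7, numpy8), axis=0)  # shape (2, 3)
cols = np.stack((numpy7, numpy8), axis=1)  # shape (3, 2)
print(rows.shape, cols.shape)

# Stacking along the second axis is the transpose of stacking along the first.
print(np.array_equal(cols, rows.T))  # True

# axis=2 does not exist in the 2-D result, so NumPy raises an error.
try:
    np.stack((numpy7, numpy8), axis=2)
    stack_failed = False
except ValueError:
    stack_failed = True
print(stack_failed)  # True
```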

Stacking along axes works well, but what are some other NumPy array stacking methods?

Let’s say you wanted to stack along rows. Here’s how to do so:

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])
numpy9 = np.hstack((numpy7, numpy8))
print(numpy9)

[14 28 42 15 30 45]

To stack arrays along rows (that is, horizontally), use the np.hstack function and pass in a tuple containing the arrays you want to stack. In this example, since both arrays are 1-D, stacking along rows simply merged the two arrays into a single 1-D array with the elements of numpy7 listed before the elements of numpy8.

Now what if you wanted to stack along columns? Here’s how to do so:

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])
numpy9 = np.vstack((numpy7, numpy8))
print(numpy9)

[[14 28 42]
 [15 30 45]]

To stack arrays along columns (that is, vertically), use the np.vstack function; the parameter for this function is the same as the parameter for the np.hstack function (a tuple containing the arrays you want to stack). In the example, stacking along columns created a 2×3 2-D array (this is the same outcome as stacking along the first axis with np.stack).

Now, there’s another way that you can stack your arrays: along depth (or height). Here’s how to do so:

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])
numpy9 = np.dstack((numpy7, numpy8))
print(numpy9)

[[[14 15]
  [28 30]
  [42 45]]]

Stacking along depth/height follows the same idea as stacking along rows or columns; the only difference is that you’d use the np.dstack function. In this example, stacking along depth/height created a 3-D array with a shape of (1, 3, 2), which is interesting because when I stacked the same two NumPy arrays along the second axis in a previous example, I got a 3×2 2-D array.
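
Here’s a short side-by-side sketch of the three stacking functions on the same two 1-D arrays, so you can compare the resulting shapes. The names h, v, and d are just shorthand I chose for this example.

```python
import numpy as np

numpy7 = np.array([14, 28, 42])
numpy8 = np.array([15, 30, 45])

h = np.hstack((numpy7, numpy8))
v = np.vstack((numpy7, numpy8))
d = np.dstack((numpy7, numpy8))
print(h.shape, v.shape, d.shape)  # (6,) (2, 3) (1, 3, 2)

# For 1-D inputs, hstack matches a plain concatenation
# and vstack matches stacking along the first axis.
print(np.array_equal(h, np.concatenate((numpy7, numpy8))))    # True
print(np.array_equal(v, np.stack((numpy7, numpy8), axis=0)))  # True
```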

Now that we’ve discussed the basics of joining NumPy arrays, let’s discover how to do the reverse (splitting arrays).

Here’s a simple example of splitting a NumPy array:

numpy10 = np.array([100, 200, 300, 400, 500, 600, 700, 800, 900, 1000])
numpy11 = np.array_split(numpy10, 2)
print(numpy11)

[array([100, 200, 300, 400, 500]), array([ 600,  700,  800,  900, 1000])]

To split up a NumPy array, use the np.array_split function and pass in two parameters-the array you want to split up and the number of splits you want to use on the array. In this example, I have an array of 10 elements that I split in two.

Something to note when splitting arrays is that the number of elements in the array doesn’t need to be divisible by the number of splits you want to use. Granted, I did use two splits on a 10-element array, which divides evenly. Watch what happens when I use three splits on the same 10-element array:

numpy10 = np.array([100, 200, 300, 400, 500, 600, 700, 800, 900, 1000])
numpy11 = np.array_split(numpy10, 3)
print(numpy11)

[array([100, 200, 300, 400]), array([500, 600, 700]), array([ 800,  900, 1000])]

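The split still works, though the array isn’t split evenly. NumPy decides how to split the array: the earlier sub-arrays each get one extra element until the leftover elements are used up, which is why the sizes here are 4, 3, and 3.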
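
Here’s a quick sketch that checks the sizes of the uneven split. I’ve also included np.split, the stricter sibling of np.array_split, which refuses to split an array that can’t be divided evenly. The names parts, sizes, and strict_split_ok are my own for this example.

```python
import numpy as np

numpy10 = np.array([100, 200, 300, 400, 500, 600, 700, 800, 900, 1000])

parts = np.array_split(numpy10, 3)
sizes = [len(part) for part in parts]
print(sizes)  # [4, 3, 3]

# np.split is the strict version: it insists on equal-sized pieces.
try:
    np.split(numpy10, 3)
    strict_split_ok = True
except ValueError:
    strict_split_ok = False
print(strict_split_ok)  # False
```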

Now let’s say you wanted to access one of the individual arrays. Here’s how to do so (using the 3-split example):

print(numpy11[0])
print(numpy11[1])
print(numpy11[2])

[100 200 300 400]
[500 600 700]
[ 800  900 1000]

The np.array_split function returns a regular Python list of arrays, so accessing the individual arrays works the same way as accessing elements from a list; in this case, the first array is index 0, the second array is index 1, and so on.

Now what if we wanted to access individual elements from these split arrays? Take a look at this example:

print(numpy11[0][1])
print(numpy11[1][1])
print(numpy11[2][1])

200
600
900

To access an individual element in each individual array, you’d need to add another indexing call. In this example, the first indexing call refers to the array itself while the second indexing call refers to the element inside the array.

Now, how would you split a multi-dimensional array? Take a look at this example:

numpy12 = np.array([[200, 400, 600, 800, 1000, 1200], [300, 600, 900, 1200, 1500, 1800]])
numpy13 = np.array_split(numpy12, 2)
print(numpy13)

[array([[ 200,  400,  600,  800, 1000, 1200]]), array([[ 300,  600,  900, 1200, 1500, 1800]])]

In this example, I am splitting a 2-D array in two; interestingly enough, both of the split arrays are still 2-D (each one has a shape of (1, 6)).

As you can see in the example, there are two 1-D arrays inside the 2-D array, therefore, splitting the 2-D array in two made sense. However, what if you wanted to split this 2-D array another way? Let’s see what happens when we split this array in three:

numpy12 = np.array([[200, 400, 600, 800, 1000, 1200], [300, 600, 900, 1200, 1500, 1800]])
numpy13 = np.array_split(numpy12, 3)
print(numpy13)

[array([[ 200,  400,  600,  800, 1000, 1200]]), array([[ 300,  600,  900, 1200, 1500, 1800]]), array([], shape=(0, 6), dtype=int32)]

When you split this array in three, you get both of the 1-D arrays from the 2-D array (each as a 1×6 array) plus an empty array with a shape of (0, 6). If I were to split this array in four, I would get both of those arrays plus two empty arrays with a shape of (0, 6).

A neat thing about splitting arrays is that, just like with joining arrays, you can split the arrays along a certain axis. Let’s see how axis-splitting an array works with a 2-D array:

numpy14 = np.array([[22, 44, 66, 88, 110], [33, 66, 99, 132, 165]])
numpy15 = np.array_split(numpy14, 5, axis=1)
print(numpy15)

[array([[22],
       [33]]), array([[44],
       [66]]), array([[66],
       [99]]), array([[ 88],
       [132]]), array([[110],
       [165]])]

In this example, I split the 2-D array numpy14 in five along the second axis, resulting in a list of five 2×1 arrays; each one holds a single column of numpy14 (one element from each of the two 1-D arrays). To split a NumPy array along a certain axis, specify the axis you want to split along after the number of splits you want to perform.

  • Just as with axis-joining an array, if you don’t specify an axis to split along in the np.array_split function, the array will automatically split along the first axis.
  • You can’t split along an axis beyond the scope of the array. In this example, since numpy14 is a 2-D array, you can’t split along axis=2 since a 2-D array doesn’t have a third axis.
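
Since splitting is the reverse of joining, a nice sanity check is to split along an axis and then concatenate along that same axis. Here’s a minimal sketch; the names columns and rejoined are ones I picked for this example.

```python
import numpy as np

numpy14 = np.array([[22, 44, 66, 88, 110], [33, 66, 99, 132, 165]])

# Split into five 2x1 column arrays along the second axis.
columns = np.array_split(numpy14, 5, axis=1)
print(len(columns), columns[0].shape)  # 5 (2, 1)

# Joining the pieces along the same axis gives back the original array.
rejoined = np.concatenate(columns, axis=1)
print(np.array_equal(rejoined, numpy14))  # True
```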

Now, I mentioned that you can join arrays along rows, columns, and depth/height. However, did you know that each of the array-joining functions (np.hstack, np.vstack, and np.dstack) also has an array-splitting counterpart (np.hsplit, np.vsplit, and np.dsplit)?

Let’s demonstrate the np.hsplit function first, which splits your array along rows:

numpy16 = np.array([40, 80, 120, 160, 200, 240])
numpy17 = np.hsplit(numpy16, 2)
print(numpy17)

[array([ 40,  80, 120]), array([160, 200, 240])]

In this example, I split numpy16 in two along rows; as you can see, both split arrays are displayed along a single row. Also, even though np.hsplit is the counterpart to np.hstack, np.hsplit doesn’t require a tuple as a parameter since you’re splitting a single array rather than joining multiple arrays.

Now let’s check out the np.vsplit function, which splits your array along columns:

numpy18 = np.array([[2010, 2011, 2012, 2013], [2014, 2015, 2016, 2017], [2018, 2019, 2020, 2021]])
numpy19 = np.vsplit(numpy18, 3)
print(numpy19)

[array([[2010, 2011, 2012, 2013]]), array([[2014, 2015, 2016, 2017]]), array([[2018, 2019, 2020, 2021]])]

Interestingly, the output for np.vsplit is displayed in the same format as the output for np.hsplit. However, keep in mind that unlike np.hsplit, np.vsplit won’t work on 1-D arrays; you’ll need at least a 2-D array to make np.vsplit work.

Finally, let’s demonstrate the np.dsplit function, which splits your array along depth/height:

numpy20 = np.array([[[1996, 1997, 1998, 1999], [2000, 2001, 2002, 2003], [2004, 2005, 2006, 2007]]])
numpy21 = np.dsplit(numpy20, 4)
print(numpy21)

[array([[[1996],
        [2000],
        [2004]]]), array([[[1997],
        [2001],
        [2005]]]), array([[[1998],
        [2002],
        [2006]]]), array([[[1999],
        [2003],
        [2007]]])]

Unlike the output for np.hsplit and np.vsplit, the output for np.dsplit displays stacked, with elements from each 1-D array interspersed with each other. Also, for np.dsplit to work, you’ll need at least a 3-D array; 2-D arrays won’t work with np.dsplit.
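
Here’s a rough sketch that tests the dimension requirements for all three split functions. The try_split helper is hypothetical (I wrote it just for this example, it isn’t part of NumPy), and one_d and two_d are made-up test arrays.

```python
import numpy as np

one_d = np.array([40, 80, 120, 160, 200, 240])
two_d = one_d.reshape(2, 3)


def try_split(split_func, array, sections):
    """Hypothetical helper: return the pieces, or None if NumPy rejects the array."""
    try:
        return split_func(array, sections)
    except ValueError:
        return None


print(try_split(np.hsplit, one_d, 2) is not None)  # True
print(try_split(np.vsplit, one_d, 2) is not None)  # False (needs 2-D or more)
print(try_split(np.vsplit, two_d, 2) is not None)  # True
print(try_split(np.dsplit, two_d, 3) is not None)  # False (needs 3-D or more)
```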

Thanks for reading,

Michael

Python Lesson 20: Indexing, Slicing, and Iterating Through NumPy Arrays (NumPy pt. 3)

Hello everybody,

Michael here, and today’s lesson will be on indexing, slicing, and iterating through NumPy arrays; this is part 3 in my NumPy series.

First off, let’s discuss how to access elements in NumPy arrays. Here’s a simple example of this:

import numpy as np

arrA = np.array([7, 14, 21, 28, 35, 42])
print(arrA[2])

21

As you can see, accessing elements in NumPy arrays is similar to accessing elements in regular Python lists.

But that was just a simple 1-D array. How would you access elements from a 2-D array?

arrB = np.array([[9, 18, 27, 36, 45], [11, 22, 33, 44, 55]])
print(arrB[0, 2])

27

Turns out, accessing elements from a 2-D array is simple as well, except you’d use two parameters rather than just one. The first parameter specifies which row (that is, which 1-D array within the 2-D array) you’d like to search in and the second parameter specifies the element in that row you’d like to retrieve. In this example, I’m retrieving the third element from the first row (recall that Python indexing starts with 0).

Simple enough right? Now let’s see how we can access elements from a 3-D array:

arrC = np.array([[[12, 24, 36, 48, 60], [13, 26, 39, 52, 65], [14, 28, 42, 56, 70]], [[20, 40, 60, 80, 100], [21, 42, 63, 84, 105], [22, 44, 66, 88, 110]]])
print(arrC[1, 2, 1])

44

This one is a bit more complex since there is more to break down, but there’s an easy way to explain 3-D array indexing. First of all, since there are three dimensions, there are obviously three parameters you’d need to use.

Aside from that, the first parameter is 1, which tells Python to look for an element in the second 2-D array. The second parameter is 2, which narrows the search down to the third 1-D array within that 2-D array. The third parameter is 1, which further narrows down the search to the second element within that 1-D array; the element that is returned from this search is 44.
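
If it helps, here’s the same lookup broken into one step per parameter. This is just a sketch for illustration; block, row, and element are names I made up.

```python
import numpy as np

arrC = np.array([[[12, 24, 36, 48, 60], [13, 26, 39, 52, 65], [14, 28, 42, 56, 70]],
                 [[20, 40, 60, 80, 100], [21, 42, 63, 84, 105], [22, 44, 66, 88, 110]]])

block = arrC[1]   # second 2-D array, shape (3, 5)
row = block[2]    # third 1-D array inside that 2-D array
element = row[1]  # second element of that 1-D array
print(element)                   # 44
print(element == arrC[1, 2, 1])  # True
```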

Now let’s demonstrate how to perform negative (or reverse) indexing on a 1-D array:

arrD = np.array([0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4])
print(arrD[-2])

1.2

Reverse indexing on a NumPy array is the same as reverse indexing on a regular Python list. In this example, I am retrieving the element at index -2 (1.2), which refers to the second element from the right of the array.

Now here’s reverse indexing at work on a multi-dimensional array:

arrE = np.array([[-12, -9, -6, -3, 0], [-28, -21, -14, -7, 0], [-32, -24, -16, -8, -0]])
print(arrE[1, -1])

0

Reverse indexing on a multi-dimensional array is also fairly simple. In this example, I am retrieving the last element (or rightmost element) from the second array-within-an-array.

One more neat thing about NumPy array indexing is that it allows you to perform basic arithmetic on certain elements of the array. Here’s an example of that:

arrF = np.array([2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048])
print(arrF[2]+arrF[-2])
print(arrF[3]-arrF[0])
print(arrF[-1]*arrF[1])
print(arrF[2]/arrF[0])
print(arrF[2]**2)

1032
14
8192
4.0
64

In this example, I am performing the four basic arithmetic operations (addition, subtraction, multiplication, and division) and exponentiation (raising a number to a certain power) on certain elements in the array; I’m using different elements for each operation.

Next up, let’s discuss slicing an array. In the context of programming, array slicing is when you take certain element(s)/portion(s) from an array to create a new, smaller array. Here’s a simple example of NumPy array slicing:

arrG = np.array([1982, 1986, 1990, 1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022])
print(arrG[0:5])

[1982 1986 1990 1994 1998]

In this example, Python is printing out the first five elements of arrG. Recall that when retrieving a range of elements from an array, the element corresponding to the end index is not included-so in this case, the 6th element wasn’t included.

Now, just as with regular Python lists, you can slice arrays without a start index or end index; you just need one of these indexes to slice an array. Let me show you what I’m talking about:

print(arrG[5:])

[2002 2006 2010 2014 2018 2022]

In this example, I used [5:] to retrieve the sixth element in the array onward.

print(arrG[:5])

[1982 1986 1990 1994 1998]

In this example, I used [:5] to retrieve all elements in the array up to BUT NOT INCLUDING the sixth element.

The next topic I want to discuss is negative (or reverse) slicing; just as you can perform reverse indexing on an array, you can also perform reverse slicing on an array as well.

Let me show you an example of reverse slicing:

print(arrG[-4:-1])

[2010 2014 2018]

Reverse slicing has the same idea as reverse indexing-indexing starts at -1, which corresponds to the rightmost element in the array (2022 in the case of arrG).

If I wanted to include 2022 in my reverse slicing, I would’ve used this code:

print(arrG[-4:])

[2010 2014 2018 2022]

Had I used the line print(arrG[-4:0]), I would’ve received an empty array as output.
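
One way to think about reverse slicing (a sketch of my own, with n as a name I picked for the array length) is that a negative index i points at the same position as n + i. That also shows why [-4:0] comes back empty.

```python
import numpy as np

arrG = np.array([1982, 1986, 1990, 1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022])
n = len(arrG)  # 11

# A negative index i refers to the same position as n + i.
print(np.array_equal(arrG[-4:-1], arrG[n - 4:n - 1]))  # True
print(np.array_equal(arrG[-4:], arrG[n - 4:]))         # True

# With [-4:0] the start (position 7) is already past the end (position 0).
print(arrG[-4:0].size)  # 0
```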

Now let’s explore a new array concept that I haven’t discussed here before-the concept of step. The step concept allows you to set a step for the slicing (e.g. return every other element, etc.).

Let’s see a simple example of slicing an array by step:

print(arrG[::3])

[1982 1994 2006 2018]

In this example, I retrieved every third element from arrG starting with the first element-1982. In other words, I retrieved the 1st, 4th, 7th, and 10th elements from arrG.

Now what if I wanted to perform step slicing only on a certain portion of the array:

print(arrG[0:6:2])

[1982 1990 1998]

In this example, I am performing step slicing on the 1st-6th elements of arrG (which correspond to indexes 0-5) by retrieving every other element from this portion of the array. If you want to perform step slicing on a portion of the array (rather than the whole array), you need to specify the portion of the array you would like to slice in the first two parameters; since the end index is not included, using 0 and 6 as the first two parameters covers indexes 0 through 5.
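
A handy way to see which indexes a step slice touches is Python’s built-in range, which takes the same start, stop, and step values. Here’s a small sketch; picked_indexes is just a name I made up.

```python
import numpy as np

arrG = np.array([1982, 1986, 1990, 1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022])

# range(start, stop, step) lists the indexes that arrG[0:6:2] selects.
picked_indexes = list(range(0, 6, 2))
print(picked_indexes)  # [0, 2, 4]

# Selecting those indexes directly gives the same elements as the slice.
print(np.array_equal(arrG[0:6:2], arrG[picked_indexes]))  # True
```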

Now, how would we slice multi-dimensional arrays? Here’s an example (using a 2-D array):

arrH = np.array([[22, 44, 66, 88, 100], [33, 66, 99, 132, 165]])
print(arrH[0, 1:4])

[44 66 88]

In this example, the slicing expression (referring to the line arrH[0, 1:4]) has two parameters rather than one; the first parameter contains the row where you wish to perform the slicing and the second parameter contains the range of elements you would like to retrieve (up to but not including the end index).

In the slicing expression, I retrieved the 2nd through 4th elements (indexes 1-3) in the first row of the array (index 0). Step slicing also works for multi-dimensional arrays; for instance, the line print(arrH[0, ::2]) would have worked here (and it would have returned the 1st, 3rd, and 5th elements from the first row).
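
Here’s a short sketch of a couple more 2-D slices on the same array, including a plain : in the first position, which keeps every row. The names first_row_steps and middle_columns are my own.

```python
import numpy as np

arrH = np.array([[22, 44, 66, 88, 100], [33, 66, 99, 132, 165]])

# Every other element from the first row.
first_row_steps = arrH[0, ::2]
print(first_row_steps.tolist())  # [22, 66, 100]

# A bare ":" keeps all rows, so this takes indexes 1-3 from every row.
middle_columns = arrH[:, 1:4]
print(middle_columns.tolist())  # [[44, 66, 88], [66, 99, 132]]
```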

Now, how would we go about slicing a 3-D array? Take a look at the example below:

arrI = np.array([[[0, -9, -18, -27, -36], [0, -8, -16, -24, -32], [0, -7, -14, -21, -28]]])
print(arrI[0, 1, 0:3])

[  0  -8 -16]

In order to slice a 3-D array, you’d need 3 parameters for the slicing expression; the first parameter represents the 2-D array within the 3-D array where you wish to perform the slicing, the second parameter narrows the slicing focus to a 1-D array within the 2-D array you selected, and the third parameter represents the range of elements in the 1-D array that you want to retrieve.

In this example, the first parameter is 0, which tells Python to start the array slicing in the first (and only) 2-D array within the 3-D array. The next parameter is 1, which tells Python to narrow the array slicing focus to the second 1-D array within the 2-D array. The last parameter is 0:3, which tells Python to retrieve the 1st-3rd elements from the second 1-D array within the 2-D array.

Last but not least, let’s explore how to iterate through NumPy arrays (which involves looping through all array elements).

To iterate through NumPy arrays, a for loop will do the trick. Here’s an example of iterating through a 1-D array:

arrJ = np.arange(10, 75, 5, int)
print(arrJ)

[10 15 20 25 30 35 40 45 50 55 60 65 70]

for x in arrJ:
    print(x)

10
15
20
25
30
35
40
45
50
55
60
65
70

To iterate through a simple 1-D array, all you need is the lines for x in [array name]: print(x). Not hard at all.

Let me discuss the arange function I used. The arange function is simply another way to create a NumPy array; here I passed it 4 parameters: the first element in the array, the stop value (which is NOT included in the array, which is why the array above ends at 70 rather than 75), the increment/decrement that is used to generate each element of the array, and the number type of the elements in the array (you usually use int or float).
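
Here’s a quick sketch showing how arange treats its second parameter. The array with_75 is an extra example of mine, added only to show how you’d include 75 if you wanted it.

```python
import numpy as np

arrJ = np.arange(10, 75, 5, int)
print(arrJ[-1])    # 70
print(75 in arrJ)  # False

# To include 75, the second parameter has to go past it.
with_75 = np.arange(10, 76, 5, int)
print(with_75[-1])  # 75
```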

Now let’s demonstrate iterating through a 2-D array:

arrK = np.array([[1, 1, 2, 3, 5, 8], [13, 21, 34, 55, 89, 144]])

for y in arrK:
    print(y)

[1 1 2 3 5 8]
[ 13  21  34  55  89 144]

Looks good, right? The for loop syntax I used works if you want to iterate through each 1-D array, but it doesn’t work if you want to iterate through each element (known in programming lingo as a scalar) in each array-within-an-array.

Here’s how to iterate through each element in the 2-D array:

for x in arrK:
    for y in x:
        print(y)

1
1
2
3
5
8
13
21
34
55
89
144

A simple way to iterate through multi-dimensional arrays is the use of nested for loops. In this example, the outer for loop iterates through the main 2-D array while the inner for loop iterates through each element in each 1-D array within the 2-D array.

This is a great way to iterate through multi-dimensional arrays, but here’s an even better way to iterate through multi-dimensional arrays:

for x in np.nditer(arrK):
    print(x)

1
1
2
3
5
8
13
21
34
55
89
144

Simply using NumPy’s nditer function (and passing your array as a parameter) will iterate through each element in your array with a single for loop, no matter how many dimensions the array has. The nditer function is super helpful for this reason; recall that NumPy arrays can have dozens of dimensions (the limit was 32 in NumPy 1.x), which would make for a very cumbersome-to-write nested for loop.
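
One detail worth knowing is that nditer hands you each element as a tiny 0-D array rather than a plain Python number. Here’s a sketch that checks this and compares nditer with the array’s .flat attribute, which is another way to visit every element; the names from_nditer, from_flat, and ndims are mine.

```python
import numpy as np

arrK = np.array([[1, 1, 2, 3, 5, 8], [13, 21, 34, 55, 89, 144]])

# Collect every element with nditer and with .flat, converting to plain ints.
from_nditer = [int(x) for x in np.nditer(arrK)]
from_flat = [int(x) for x in arrK.flat]
print(from_nditer == from_flat)  # True

# Each item nditer yields is a 0-D array.
ndims = {x.ndim for x in np.nditer(arrK)}
print(ndims)  # {0}
```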

Now check out the two methods when used for iterating through a 3-D array. Which do you think is more efficient?

arrL = np.array([[[2, 4, 8, 16, 32], [3, 9, 27, 81, 243], [4, 16, 64, 256, 1024]]])

for x in arrL:
    for y in x:
        for z in y:
            print(z)

2
4
8
16
32
3
9
27
81
243
4
16
64
256
1024

for x in np.nditer(arrL):
    print(x)

2
4
8
16
32
3
9
27
81
243
4
16
64
256
1024

As you can see, both methods work great for iterating through arrL. However, the nditer method is more efficient to write, since it accomplishes the iterating with a single for loop (the nested for loops method needs one loop per dimension, so three loops here).

Finally, I wanted to discuss step iterating through an array. Just as you can perform step slicing on an array, you can also step iterate through an array too:

for x in np.nditer(arrL[:, ::3]):
    print(x)

2
4
8
16
32

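In this example, the slice [:, ::3] applies the step of 3 to the second axis of arrL (the three 1-D arrays), so only every third 1-D array is kept. Since there are only three 1-D arrays, just the first one is iterated through.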

  • For step slicing and step iteration, remember that the step starts with the first element in the array unless you specify otherwise!
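
Here’s a small sketch comparing where the step lands in the slice. Putting ::3 in the second position steps through the 1-D arrays, while putting it in the third position steps through the elements inside each 1-D array. The name every_third is just one I picked.

```python
import numpy as np

arrL = np.array([[[2, 4, 8, 16, 32], [3, 9, 27, 81, 243], [4, 16, 64, 256, 1024]]])

# Step on the second axis: only the first of the three 1-D arrays survives.
print(arrL[:, ::3].shape)  # (1, 1, 5)

# Step on the third axis: every third element within each 1-D array.
every_third = [int(x) for x in np.nditer(arrL[:, :, ::3])]
print(every_third)  # [2, 16, 3, 81, 4, 256]
```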

Thanks for reading,

Michael

Python Lesson 19: NumPy Array Shaping (NumPy pt.2)

Hello everybody,

Michael here, and today’s post will be on how to manipulate the shape of arrays in NumPy; this is the second lesson in my NumPy series.

Now, before we get started, let’s remember to import NumPy to our IDE using the line import numpy as np.

  • Remember to pip install numpy if you haven’t done so already! Also, if you’re not sure if you’ve already got NumPy, use the pip list command to check.

Now that the import has been taken care of, let’s first demonstrate how to find the shape of a NumPy array:

arrayA = np.array([[0.5, 1, 1.5, 2, 2.5, 3], [3.5, 4, 4.5, 5, 5.5, 6]])
print(arrayA.shape)

(2, 6)

In this example, I created a 2-D array with six elements in each inner array. To find out the array’s shape, I used the array’s .shape attribute. The shape is shown as (2, 6), which means that this array has 2 rows (the first dimension) and 6 elements per row (the second dimension).
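
Here’s a quick sketch of how .shape relates to two other array attributes, .ndim (the number of dimensions) and .size (the total number of elements).

```python
import numpy as np

arrayA = np.array([[0.5, 1, 1.5, 2, 2.5, 3], [3.5, 4, 4.5, 5, 5.5, 6]])

print(arrayA.shape)  # (2, 6)
print(arrayA.ndim)   # 2
print(arrayA.size)   # 12

# The number of dimensions is the length of the shape tuple.
print(len(arrayA.shape) == arrayA.ndim)  # True
```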

Now how would we reshape this array? Let’s say that we wanted to turn this 2×6 array into a 3×4 array. Here’s how we would do that:

arrayB = arrayA.reshape(3, 4)
print(arrayB.shape)
print(arrayB)

(3, 4)
[[0.5 1.  1.5 2. ]
 [2.5 3.  3.5 4. ]
 [4.5 5.  5.5 6. ]]

To reshape a NumPy array, use the .reshape function on your current array and change the parameters of the .reshape function to the dimensions you want the new array to have.

In this example, I changed the shape of arrayA to 3 X 4, which means that the new array (stored in the arrayB variable) will have 3 rows with 4 elements in each row; note that it is still a 2-D array.

Look, the .reshape function is quite versatile, but you can’t just use any two numbers as the parameters here. Here’s what happened when I tried reshaping arrayA to 5×7:

arrayC = arrayA.reshape(5, 7)
print(arrayC)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-70ed58b6935b> in <module>
----> 1 arrayC = arrayA.reshape(5, 7)
      2 print(arrayC)

ValueError: cannot reshape array of size 12 into shape (5,7)

Since 5 times 7 is 35 and arrayA only has 12 elements, NumPy couldn’t reshape the array to 5×7; the new dimensions must multiply out to exactly the array size.

In order to know all of the possible ways that you can shape the array, count all of the elements in the array (12 in the case of arrayA) then make a note of all of the factors of that amount. Since there are 12 elements in arrayA, the possible shapes arrayA can take include (1, 12), (2, 6), and (3, 4), as these are all number pairs that have a product of 12.

  • (12, 1), (6, 2), and (4, 3) work as well-recall that, from the Commutative Property of Multiplication, you can switch the order of the factors around and get the same product. However, keep in mind that the dimensions will be different if you switch the order of the numbers.
  • Also, when reshaping the array, keep in mind that the number of dimensions is capped (at 32 in NumPy 1.x; I mentioned this in the previous lesson)
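
If you don’t feel like working out the factor pairs by hand, here’s a rough sketch that lists every possible 2-D shape for arrayA. The names size and valid_shapes are ones I made up.

```python
import numpy as np

arrayA = np.array([[0.5, 1, 1.5, 2, 2.5, 3], [3.5, 4, 4.5, 5, 5.5, 6]])
size = arrayA.size  # 12

# Every (rows, columns) pair whose product equals the array size.
valid_shapes = [(rows, size // rows) for rows in range(1, size + 1) if size % rows == 0]
print(valid_shapes)  # [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)]

# Each of these shapes is accepted by .reshape.
for shape in valid_shapes:
    print(shape, arrayA.reshape(shape).shape == shape)
```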

Now I’ve demonstrated how to reshape a NumPy array with a standard number-pair tuple. However, did you know that you can also include an unknown dimension? Here’s how that works:

arrayD = np.array([0, 3, 6, 9, 12, 15, 18, 21, 24, 27])

arrayE = arrayD.reshape(2, -1)
print(arrayE)

[[ 0  3  6  9 12]
 [15 18 21 24 27]]

Yes, NumPy allows you to use -1 as a shape parameter in the .reshape method. But what does this mean? -1 simply represents an unknown dimension that you want NumPy to figure out. How does NumPy figure out this unknown dimension? It looks at the length of the array and the known dimension(s) in order to determine how to best shape the array. In this example, NumPy sees that the known dimension is 2, so the array is shaped to have two rows with five elements per row-the array has 10 elements, so 10 divided by 2 (# of rows) equals 5 (# of elements per row).

Now what if I had switched the order of the parameters in the .reshape function to (-1, 2). What would we get?

arrayF = arrayD.reshape(-1, 2)
print(arrayF)

[[ 0  3]
 [ 6  9]
 [12 15]
 [18 21]
 [24 27]]

If I switch the order of the .reshape function parameters to (-1, 2), the array would still have 10 elements but instead of 2 rows-by-5 columns, I get an array with 5 rows-by-2 columns.

As you can see, using the unknown dimension trick of -1 works just great with two shape parameters. However, the -1 trick also works with three shape parameters. Take a look at the example below:

arrayG = arrayD.reshape(2, 5, -1)
print(arrayG)

[[[ 0]
  [ 3]
  [ 6]
  [ 9]
  [12]]

 [[15]
  [18]
  [21]
  [24]
  [27]]]

In this example, using -1 as a third shape parameter splits the array into two 5×1 arrays.

Now what if I still used three shape parameters, but placed the -1 as the second shape parameter?

arrayH = arrayD.reshape(2, -1, 5)
print(arrayH)

[[[ 0  3  6  9 12]]

 [[15 18 21 24 27]]]

In this example, using -1 as the second shape parameter still splits the array in two, except you get two 1×5 arrays.

  • You can only have one unknown dimension in every .reshape function!
  • For all of the known dimensions, their product must divide the array length evenly. In other words, since arrayD has a length of 10, a single known dimension could only be 1, 2, 5, or 10, as these are all the factors of 10.
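
Here’s a small sketch that tests both of the rules above. The can_reshape helper is hypothetical (I wrote it just for this example, it isn’t a NumPy function); it simply reports whether NumPy accepts a given shape.

```python
import numpy as np

arrayD = np.array([0, 3, 6, 9, 12, 15, 18, 21, 24, 27])


def can_reshape(array, *shape):
    """Hypothetical helper: True if NumPy accepts the shape, False otherwise."""
    try:
        array.reshape(*shape)
        return True
    except ValueError:
        return False


print(arrayD.reshape(2, -1).shape)     # (2, 5)
print(arrayD.reshape(2, -1, 5).shape)  # (2, 1, 5)
print(can_reshape(arrayD, -1, -1))     # False: two unknown dimensions
print(can_reshape(arrayD, 3, -1))      # False: 3 is not a factor of 10
```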

The next NumPy trick I will show you is how to flatten arrays. In NumPy, you can convert multi-dimensional arrays into 1-D arrays-this is called flattening the array-with a simple line of code. Here’s how to flatten a NumPy array:

arrayI = np.array([[4, 8, 12], [5, 10, 15], [6, 12, 18]])
arrayJ = arrayI.reshape(-1)
print(arrayJ)

[ 4  8 12  5 10 15  6 12 18]

To flatten a NumPy array, simply use the .reshape function with the -1 parameter. That’s it. However, for the flattening to work, only use the -1 shape parameter-don’t include other shape parameters!

Now I know I said that the array-flattening trick works with multi-dimensional arrays, but it also works with 0-D arrays. Here’s an example of this:

arrayK = np.array(7)
arrayL = arrayK.reshape(-1)
print(arrayL)

[7]

As you can see, my 0-D array was successfully turned into a 1-D array.

Last but not least, I want to show you two special NumPy functions that you will likely encounter if you’re learning about more advanced Python topics (e.g. computer vision): np.zeros and np.ones. These functions allow you to create arrays of zeros and ones, respectively (recall that 0 and 1 are the two elements in the binary number system from Java Lesson 5: Java Numbering Systems).

First, let’s create an array of zeros:

arrayM = np.zeros((4,4), int)
print(arrayM)

[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

In this .zeros function, I used two parameters-a tuple to specify the shape of the array (4×4) and a value to specify the type to store the zeros (int).

Let’s say I didn’t specify a value type for the zeros. How would they be stored in the program?

arrayM = np.zeros((4,4))
print(arrayM)
print(arrayM.dtype)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
float64

In this example, I didn’t set a value type for the zeros, so they are by default stored as type float64, which is a 64-bit floating point number.

  • If you want to check the type of all the values used in the NumPy array, use the .dtype attribute, not .type.
  • The number of shape parameters you use for the shape tuple in the .zeros and .ones functions indicates the number of dimensions your array will have. In the example above, I used two shape parameters for the shape tuple, therefore my array has two dimensions.

Now let’s create an array of ones:

arrayN = np.ones((5,9), int)
print(arrayN)

[[1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 1]]

In this example, I created a 5×9 array of ones stored as type int.
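
To tie the .zeros and .ones notes together, here’s a short sketch that checks the default type, an explicit int type, and a three-number shape tuple. The names default_zeros, int_zeros, and cube_of_ones are my own.

```python
import numpy as np

default_zeros = np.zeros((4, 4))       # no type given
int_zeros = np.zeros((4, 4), int)      # explicit int type
cube_of_ones = np.ones((2, 3, 4), int)  # three shape parameters

print(default_zeros.dtype)                         # float64
print(np.issubdtype(int_zeros.dtype, np.integer))  # True

# Three numbers in the shape tuple give a 3-D array with 2*3*4 = 24 ones.
print(cube_of_ones.ndim, cube_of_ones.sum())  # 3 24
```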

Thanks for reading,

Michael