NumPy

NumPy is a Python package designed for working efficiently with homogeneous n-dimensional arrays. Because array-level operations are numerically intensive, most of NumPy is written in C and wrapped with Python. This is the key to NumPy’s performance.

Just enough Numpy

Install numpy

Before you do anything with numpy, you would have to first install it ( unless you have other data science distributions like Anaconda or Canopy installed ). Installing numpy is as simple as

# pip install numpy

Why NumPy

Let’s do a simple numeric operation – summing up the first ten million numbers (0 to 9,999,999). Let’s first do it in plain Python and then in NumPy to understand what NumPy brings to the table.

# without numpy
import time 

sum = 0

start_time = time.time()

for num in range(10000000) :
    sum = sum + num
    
print ( "sum = ", sum)

end_time = time.time()

python_time = end_time - start_time

print ( "time taken = ", python_time)
sum =  49999995000000
time taken =  3.329150438308716
# with numpy
import numpy as np
import time

sum = 0

start_time = time.time()

numbers = np.arange(10000000)

sum = np.sum(numbers, dtype = np.uint64)
print ( "sum = ", sum)

end_time = time.time()

numpy_time = end_time - start_time
factor = python_time / numpy_time

print ( "time taken = ", (end_time - start_time))

print ( "numpy is ", factor , " times faster than standard python")
sum =  49999995000000
time taken =  0.042661190032958984
numpy is  78.03698011557334  times faster than standard python

As you can see, in this run NumPy is about 78 times faster than standard Python. The exact factor will vary with your machine and from run to run. Right off the bat, you can see that NumPy brings a lot of value to the table. That level of performance improvement – all within the comfort of Python.

The power of NumPy lies in using a low-level C implementation to speed up numeric operations in Python.
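
As a sanity check on the comparison above, here is a minimal sketch showing that the loop and the NumPy call compute the same total. It uses a smaller n so it finishes quickly, and the variable names are illustrative and not from the original timing code.

```python
import numpy as np

n = 1_000_000  # smaller than the timing run above, chosen only for speed

# Plain Python: accumulate in an explicit loop
loop_total = 0
for num in range(n):
    loop_total += num

# NumPy: one vectorized call over the whole array
numpy_total = int(np.arange(n, dtype=np.int64).sum())

# Closed form for 0 + 1 + ... + (n - 1)
expected = n * (n - 1) // 2

print(loop_total == numpy_total == expected)
```

Requesting a 64-bit dtype explicitly avoids overflow on platforms where the default integer type is 32 bits wide.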


n-dimensional array

This is the core data structure in numpy. We will explore how useful it is and what you can do with it pretty soon. Let’s create a simple 1 dimensional array with just 10 numbers

import numpy as np

a = np.array([1,2,3,4,5,6,7,8,9,10])
a
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

Let’s put a second dimension to it

b = np.array( [[1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10],
               [11,12,13,14,15,16,17,18,19,20]])
b
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])

Create an array from list

An array can be created from a standard python list. All you have to do is use the array ( ) function of NumPy and pass the list to it.

numbers = [1,2,3,4,5,6,7,8,9,10,11,12]
a = np.array(numbers)
a
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

You can create a 2-d array as well from a list.

a1 = [1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10]
a2 = [11,12,13,14,15,16,17,18,19,20]
b = np.array( [a1,a2])
b
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]]) 

np.array ([[1,2,3],
           [4,5,6],
           [7,8,9]])
Question – The code above creates a
2-d array of shape 3 x 3
3-d array of shape 3 x 3
3-d array
1-d array of shape 3 x 3

numbers_1 = [1,2,3]
numbers_2 = [4,5,6]
numbers_3 = [7,8,9]
a = np.array([numbers_1, numbers_2, numbers_3])
Question – The code above creates a
2-d array of shape 3 x 3
3-d array of shape 3 x 3
3-d array
1-d array of shape 3 x 3

numbers_1 = [1,2,3]
numbers_2 = [4,5,6]
numbers_3 = [7,8,9]
a = np.array([numbers_1 + numbers_2 + numbers_3])
Question – The code above creates a
2-d array of shape 1 x 9
1-d array of length 9
3-d array
1-d array of shape 3 x 3

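shape

How do you know the number of dimensions ? Use the shape attribute (it is an attribute, not a function, so no parentheses are needed) to tell you the shape of the array. The length of the shape tuple is the number of dimensions.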

b = np.array( [[1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10],
               [11,12,13,14,15,16,17,18,19,20]])
b
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
b.shape
(2, 10)

meaning, there are 2 rows and 10 columns.
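
Alongside shape, an array carries a few related attributes. The sketch below reuses the same 2 x 10 array to show them side by side.

```python
import numpy as np

b = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])

print(b.ndim)   # number of dimensions
print(b.shape)  # length along each dimension
print(b.size)   # total number of elements
```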


numbers_1 = [1]
numbers_2 = [4]
numbers_3 = [7]
a = np.array([numbers_1, numbers_2, numbers_3])
Question – The code above creates a
2-d array of shape 3 x 1
1-d array of shape 3
2-d array of shape 3 x 3

arange ( )

Like the standard python function range ( ) , numpy has a similar function called arange ( )

numbers = np.arange(1,51)
numbers
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50])
numbers.shape
(50,)
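
Like range ( ), arange ( ) also accepts a step. A small sketch, with illustrative variable names, showing one difference: arange also allows non-integer steps.

```python
import numpy as np

evens = np.arange(0, 10, 2)     # start, stop (exclusive), step
halves = np.arange(0, 2, 0.5)   # unlike range(), a fractional step is allowed

print(evens)
print(halves)
print(list(range(0, 10, 2)) == evens.tolist())
```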

a = range(100)

b = np.array(a)
Question – b is a
2-d array of shape 100 x 1
1-d array of shape 100

reshape ( )

You can now use the reshape function to reshape the data into any number of dimensions you like, as long as the total number of elements stays the same. For example, you can reshape these 50 numbers into any of the following combinations in 2d, e.g., 5 x 10, 2 x 25, etc.

numbers.reshape(5,10)
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
       [31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
       [41, 42, 43, 44, 45, 46, 47, 48, 49, 50]])
numbers.reshape(2 , 25)
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25],
       [26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
        42, 43, 44, 45, 46, 47, 48, 49, 50]])

What happens when you try to reshape it to a 2 x 50 array ? That is not possible, because a 2 x 50 array needs 100 elements and we only have 50. Naturally, NumPy raises a ValueError.

numbers.reshape(2,50)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-27db37f04a26> in <module>
----> 1 numbers.reshape(2,50)

ValueError: cannot reshape array of size 50 into shape (2,50)
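
The rule behind this error is simply that the product of the new dimensions must equal the number of elements. Here is a minimal sketch of that check; can_reshape is a hypothetical helper written for illustration, not a NumPy function.

```python
import numpy as np

numbers = np.arange(1, 51)

def can_reshape(arr, rows, cols):
    """Hypothetical helper: True when rows * cols matches the element count."""
    return arr.size == rows * cols

print(can_reshape(numbers, 5, 10))
print(can_reshape(numbers, 2, 50))

# The same failure as above, caught instead of shown as a traceback
try:
    numbers.reshape(2, 50)
except ValueError as err:
    print("reshape failed:", err)
```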

Sometimes you need to reshape an array knowing just its columns and not its rows ( or vice-versa ). In cases like that NumPy provides a shortcut.

numbers = np.arange(1,13)

numbers.reshape(-1,2)
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])

You can do the same for columns as well.

numbers.reshape(2,-1)
array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

From file

You can also read data from a file. This example uses a csv (comma separated values) file of height vs weight, saved as height-weight-only.csv.

This data set was downloaded from kaggle at https://www.kaggle.com/mustafaali96/weight-height

To read the file, all we have to do is to use the genfromtxt ( ) function.

data = np.genfromtxt("height-weight-only.csv",delimiter=",")
array([[   nan,    nan],
       [ 73.85, 241.89],
       [ 68.78, 162.31],
       ...,
       [ 63.87, 128.48],
       [ 69.03, 163.85],
       [ 61.94, 113.65]])

The key parameters in the signature of this function look like this.

numpy.genfromtxt( fname = "file name",  # file name to be read
                  dtype = float,        # default data type is float, but can be configured.
                  comments = "#",       # indicates comments - therefore should not be read
                  delimiter = None,     # specifies the character used to delimit values in the array
                  skip_header = 0,      # the number of lines to skip at the beginning of the file
                  missing_values = None )

You see a nan (not a number) in the first row because the first row contains text headers and so cannot be read as floats. To skip the first row, just use the skip_header parameter.

data = np.genfromtxt("height-weight-only.csv",delimiter=",", skip_header = 1)
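
If you do not have the file at hand, the same behaviour can be tried on an in-memory string. This is a sketch under assumptions: csv_text is a made-up two-row stand-in for the real file, and the column names in it are illustrative.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: a header row followed by numeric rows
csv_text = "Height,Weight\n73.85,241.89\n68.78,162.31\n"

with_header = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
without_header = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(with_header.shape)      # header row is kept and parsed as nan
print(without_header.shape)   # header row is skipped
print(np.isnan(with_header[0]).all())
```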


array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
Question – Which of the following code can convert the array above to a 2-d array of shape 9 x 1
array.reshape ( 9, 1 )
array.reshape ( 9, -1 )
array.reshape ( 9, )

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
Question – Which of the following code can convert the array above to a 2-d array of size 1 x 9 ( 1 row and 9 columns )
array.reshape ( -1, 9 )
array.reshape ( 1, 9 )
array.reshape ( 9, )

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
Question – Which of the following code can convert the array above to a 1-d array of size 9.
array.reshape ( 9 )
array.reshape ( 9, -1 )
array.reshape ( 9, 1 )

a = array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

a.reshape(3,4)
Question – The code above will result in
a 2-d array of shape 3 x 4
a 2-d array of shape 3 x 4 with 3 elements filled as NA
syntax error

Array Operations

This is where we get the sweet surprise. Array operations are element wise. Let’s compare it to a list and you will see the difference

Element-wise Operations

a = list(range(11))
b = list(range(11,21))
a + b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
a1 = np.arange(1,11)
b1 = np.arange(11,21)
a1 + b1
array([12, 14, 16, 18, 20, 22, 24, 26, 28, 30])

Element-wise operations are not just across 2 arrays. You can also combine an array with a single number, for example raising every element to a power or multiplying every element by a constant. Essentially, we are eliminating the for loop.

a = list(range(11))
a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a12 = pow(a1,2)
a12
array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100], dtype=int32) 
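
To make the "eliminating the for loop" point concrete, the sketch below computes the same squares twice, once with an explicit Python loop and once with a single array expression. The variable names are illustrative.

```python
import numpy as np

values = list(range(1, 11))
squares_loop = [v ** 2 for v in values]   # plain Python needs an explicit loop

arr = np.arange(1, 11)
squares_vec = arr ** 2                    # NumPy applies ** to every element

print(squares_loop)
print(squares_vec)
```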

Array Multiplication

a13 = a1 * 3
a13
array([ 3,  6,  9, 12, 15, 18, 21, 24, 27, 30])

a = np.array([1,2,3])
b = np.array([1,2])

c = a + b
Question – The code above will result in
NumPy array of [2,4,3]
syntax error

a = np.array([1,2,3])
b = np.array([1])

c = a + b
Question – The code above will result in
NumPy array of [2,3,4]
syntax error

a = np.array([1,2,3])
b = np.array([1])

c = ( a + b ) ** 2
Question – The code above will result in
NumPy array of [4,9,16]
syntax error

a = np.array([1,2,3])
b = np.array([1])

c = ( a + b ) % 2
Question – The code above will result in
NumPy array of [0,1,0]
syntax error

a = np.array([1,2,3])
b = np.array([1])

c = ( a + b ) %% 2
Question – The code above will result in
NumPy array of [0,1,0]
syntax error
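
The questions above hinge on how NumPy pairs up arrays of different lengths, a mechanism known as broadcasting. Here is a minimal sketch of the two cases. Note that a length mismatch is reported at run time as a ValueError, not as a syntax error.

```python
import numpy as np

a = np.array([1, 2, 3])

# A length-1 array is stretched ("broadcast") to match the length of a
print(a + np.array([1]))

# Lengths 3 and 2 cannot be matched, so NumPy raises ValueError at run time
try:
    a + np.array([1, 2])
except ValueError as err:
    print("ValueError:", err)
```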

Aggregate Operations

sum ( )
a1 = np.arange(1,11)
print ( a1 )
a1.sum()
[ 1  2  3  4  5  6  7  8  9 10]
55
min ( ) & max ( )
a1.min()
1
a1.max()
10
len ( )
len(a1)
10

Aggregate Operations along an axis

a = np.arange(1,101).reshape(10,10)
a
array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

Sum along each axis. axis=1 adds up the values within each row, and axis=0 adds up the values within each column.

a.sum(axis=1)
array([ 55, 155, 255, 355, 455, 555, 655, 755, 855, 955]) 
a.sum(axis=0)
array([460, 470, 480, 490, 500, 510, 520, 530, 540, 550])

Similarly, you can do a min ( ) or max ( ) across any axis

a.min( axis = 1 )
array([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])
a.min ( axis = 0 )
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
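
The axis argument is easier to follow on a tiny array. A small sketch with a 2 x 3 array (the name m is illustrative):

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.sum(axis=0))  # collapse the rows    -> one value per column
print(m.sum(axis=1))  # collapse the columns -> one value per row
print(m.max(axis=0))
print(m.max(axis=1))
```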

Array indexing & Slicing

Array Indexing

Indexing a 1-d array works just like indexing a list.

To get the element at a particular index, just use the square brackets notation ( like a list )

b = np.arange(1,11)
b[5]
6

You can use negative indexing as well.

b[-5]
6

Indexing a 2d array is just as simple. Since the array is 2 dimensional now, you have to use 2 indices. One along each axis.

a[4,7]
48

a = np.array([[ 3,  6,  9],
              [12, 15, 18],
              [21, 24, 27]])
print ( len(a) )
Question – The output of the code above is
3
9
1

a = np.array([[ 3,  6,  9],
              [12, 15, 18],
              [21, 24, 27]])
print ( a.max() )
Question – The output of the code above is
27
9

a = np.array([[ 3,  6,  9],
              [12, 15, 18],
              [21, 24, 27]])
print ( a[1,2] )
Question – The output of the code above is
18
24

a = np.array([[ 3,  6,  9],
              [12, 15, 18],
              [21, 24, 27]])
print ( a[1,-1] )
Question – The output of the code above is
18
24

Array Slicing

Slicing a 1-d array is also similar to slicing a list. Use a slice in place of a number for indexing.

b
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
b[3:7]
array([4, 5, 6, 7])

Slicing a 2-d array extends the same functionality across all the axes.

a
array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])
a[2:5, 3:8]
array([[24, 25, 26, 27, 28],
       [34, 35, 36, 37, 38],
       [44, 45, 46, 47, 48]])

You can very well use a combination of slicing and indexing

a[4,3:8]
array([44, 45, 46, 47, 48])

If you wanted to specify all the elements across a particular axis, just use a colon (:) without anything before or after.

So, both of these are equivalent.

# Expression 1
a[4,0:10]
array([41, 42, 43, 44, 45, 46, 47, 48, 49, 50])
# Expression 2
a[4, : ]
array([41, 42, 43, 44, 45, 46, 47, 48, 49, 50])
a[[1,4], :]
array([[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [41, 42, 43, 44, 45, 46, 47, 48, 49, 50]])

What if you wanted several rows that are not next to each other ? Pass a list of row indices, like so.

a[ [1,4,8], : ]
array([[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
       [81, 82, 83, 84, 85, 86, 87, 88, 89, 90]])
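
The same ideas apply to columns. The sketch below rebuilds the 10 x 10 array and combines a colon, an index list and a slice. The variable names are illustrative.

```python
import numpy as np

a = np.arange(1, 101).reshape(10, 10)

third_column = a[:, 2]          # every row, column index 2
picked_columns = a[:, [0, 9]]   # every row, first and last columns
corner = a[[1, 4], 3:5]         # rows 1 and 4, columns 3 and 4

print(third_column)
print(picked_columns.shape)
print(corner)
```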

a = np.array([ [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
               [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
               [81, 82, 83, 84, 85, 86, 87, 88, 89, 90]])
print ( a[2,3:7] )
Question – The code above prints
[84 85 86 87]
[44, 45, 46, 47]

a = np.array([ [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
               [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
               [81, 82, 83, 84, 85, 86, 87, 88, 89, 90]])
print ( a[:,:] )
Question – The code above prints
The entire array
Prints nothing
Syntax error


Array Manipulation

So far, we have seen how to slice data from a NumPy array or use aggregate operations along an axis. In this section, we will learn about array manipulations.

Append rows or columns

Say we have a 2-d array of shape 4 x 5.

import numpy as np

numbers = np.arange(1,21)
numbers = numbers.reshape(4,5)
numbers

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

What if we wanted to insert another row at the end ? Say this row.

extras = np.array([21,22,23,24,25])

numbers = np.append(numbers,[extras],axis=0)

print ( numbers )

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]

To append it as a column instead, first reshape it into a 5 x 1 array. ( numbers now has 5 rows, so the new column needs 5 values. )

j = extras.reshape(5,-1)
j
array([[21],
       [22],
       [23],
       [24],
       [25]])
j.shape

(5, 1)
numbers = np.append(numbers,extras.reshape(5,-1),axis=1)
print ( numbers )
[[ 1  2  3  4  5 21]
 [ 6  7  8  9 10 22]
 [11 12 13 14 15 23]
 [16 17 18 19 20 24]
 [21 22 23 24 25 25]]
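
np.append returns a new array and leaves its input unchanged. As an alternative sketch, np.vstack and np.column_stack do the same row and column additions without the manual reshape. The names new_row and new_col are illustrative.

```python
import numpy as np

numbers = np.arange(1, 21).reshape(4, 5)
new_row = np.array([21, 22, 23, 24, 25])

with_row = np.vstack([numbers, new_row])          # add a 1-d array as a row
new_col = np.array([0, 0, 0, 0, 0])
with_col = np.column_stack([with_row, new_col])   # add a 1-d array as a column

print(numbers.shape)    # the original array is unchanged
print(with_row.shape)
print(with_col.shape)
```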

Insert rows or columns

What if you wanted to insert a row in the middle instead of at the end ?

In this case, you should use the insert ( ) function. Its second argument is the index before which the new values are placed.

import numpy as np

numbers = np.arange(1,21)
numbers = numbers.reshape(4,5)
numbers

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])
extras = np.array([21,22,23,24,25])
print ( extras)
[21 22 23 24 25]
numbers_new = np.insert(numbers,2,extras,axis=0)
numbers_new
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [21, 22, 23, 24, 25],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

Similarly, you can insert a column as well.

numbers = np.arange(1,21)
numbers = numbers.reshape(4,5)
numbers
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])
extras = np.array([21,22,23,24])
print ( extras)
[21 22 23 24]
numbers_new = np.insert(numbers,3,extras,axis=1)
print ( numbers_new)

[[ 1  2  3 21  4  5]
 [ 6  7  8 22  9 10]
 [11 12 13 23 14 15]
 [16 17 18 24 19 20]]
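
Two details are worth seeing in a small sketch: insert ( ) returns a new array rather than changing the original, and a single number can be inserted as a whole column. The 2 x 3 array here is illustrative.

```python
import numpy as np

numbers = np.arange(1, 7).reshape(2, 3)

with_zero_col = np.insert(numbers, 1, 0, axis=1)   # the scalar is repeated down the column

print(numbers)         # original is untouched; np.insert returns a new array
print(with_zero_col)
```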


Delete rows or columns

To delete a row or column use the delete ( ) function. For example, to delete the 3rd column below,

numbers = np.arange(1,21)
numbers = numbers.reshape(4,5)
numbers
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])
numbers_new = np.delete(numbers,2,axis=1)
print ( numbers_new )
[[ 1  2  4  5]
 [ 6  7  9 10]
 [11 12 14 15]
 [16 17 19 20]]

To delete the second row below,

numbers = np.arange(1,21)
numbers = numbers.reshape(4,5)
numbers

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])
numbers_new = np.delete(numbers,1,axis=0)
print ( numbers_new )
[[ 1  2  3  4  5]
 [11 12 13 14 15]
 [16 17 18 19 20]]
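
delete ( ) also accepts a list of indices, so several rows or columns can be removed in one call. A short sketch with illustrative variable names:

```python
import numpy as np

numbers = np.arange(1, 21).reshape(4, 5)

# A list of indices removes several rows (or columns) in one call
trimmed = np.delete(numbers, [0, 3], axis=0)    # drop the first and last rows
no_edges = np.delete(numbers, [0, 4], axis=1)   # drop the first and last columns

print(trimmed)
print(no_edges.shape)
```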

NumPy Datatypes

NumPy supports many datatypes, including fixed-width integers (such as int8, int32 and int64), floats (such as float32 and float64), complex numbers and booleans.

If you come from a pure Python background, some of these names may be confusing. That is because NumPy is designed to be an efficient numerical computation library. And because it is built in C, it makes heavy use of platform-specific datatypes to make the code run efficiently.

If you create a NumPy array from scratch without specifying a datatype, it is created with floats by default. Like so. ( np.empty does not initialize its contents, so the values shown are arbitrary. )

np.empty((2, 2))
array([[8.3810358e-312, 0.0000000e+000],
       [0.0000000e+000, 0.0000000e+000]])
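
The datatype can be inspected and chosen explicitly. A minimal sketch, using np.zeros instead of np.empty so the values are predictable. The variable names are illustrative.

```python
import numpy as np

zeros = np.zeros((2, 2))                     # no dtype given
small = np.array([1, 2, 3], dtype=np.int8)   # request a specific datatype
as_float = small.astype(np.float32)          # astype returns a converted copy

print(zeros.dtype)
print(small.dtype, small.itemsize)           # itemsize is bytes per element
print(as_float.dtype, as_float.itemsize)
```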

Additional Reading

Meshgrid

Meshgrid is a useful feature of NumPy when creating a grid of co-ordinates. The function of meshgrid is really simple. Say you have a list of x and y co-ordinates

import numpy as np

x = np.arange(1,10)
y = np.arange(1,10)

Let’s plot it to see what it looks like.

import matplotlib.pyplot as plt

plt.scatter(x,y)
plt.savefig("scatter-plot.png")

What if you want all the co-ordinates in between, that is, every point on the grid ?

meshgrid ( ) is a convenience function in numpy that can generate all the points in the grid.

xx,yy = np.meshgrid(x,y)
print(xx)
print(yy)
[[1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]
 [1 2 3 4 5 6 7 8 9]]
[[1 1 1 1 1 1 1 1 1]
 [2 2 2 2 2 2 2 2 2]
 [3 3 3 3 3 3 3 3 3]
 [4 4 4 4 4 4 4 4 4]
 [5 5 5 5 5 5 5 5 5]
 [6 6 6 6 6 6 6 6 6]
 [7 7 7 7 7 7 7 7 7]
 [8 8 8 8 8 8 8 8 8]
 [9 9 9 9 9 9 9 9 9]]
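
A smaller grid makes the layout easier to read. In this sketch x and y have different lengths, so it is clear which input controls the rows and which controls the columns. The points array is an illustrative extra, not part of meshgrid itself.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

xx, yy = np.meshgrid(x, y)

print(xx.shape, yy.shape)   # both have len(y) rows and len(x) columns

# Pair up the two grids to list every (x, y) point
points = np.column_stack([xx.ravel(), yy.ravel()])
print(points)
```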


Now, if you plot all of the elements on a scatter plot, you get this.

import matplotlib.pyplot as plt
%matplotlib inline

plt.scatter(xx,yy)

This can be used in conjunction with matplotlib’s contour or contourf functions to evaluate the behaviour of functions over a grid. For example, if you want to visualize circles, just create another variable z that is a function of x and y. A circle of radius r centred at the origin satisfies x^2 + y^2 = r^2, so every contour of the following z is an arc of a circle.

z = xx**2 + yy**2

print(z)
[[  2   5  10  17  26  37  50  65  82]
 [  5   8  13  20  29  40  53  68  85]
 [ 10  13  18  25  34  45  58  73  90]
 [ 17  20  25  32  41  52  65  80  97]
 [ 26  29  34  41  50  61  74  89 106]
 [ 37  40  45  52  61  72  85 100 117]
 [ 50  53  58  65  74  85  98 113 130]
 [ 65  68  73  80  89 100 113 128 145]
 [ 82  85  90  97 106 117 130 145 162]]
plt.contour(xx,yy,z,levels=[10,20,30,40,50,60,70,80,90,100])
<matplotlib.contour.QuadContourSet at 0x1295f150>

Each of these lines represents a single z value. For example, the innermost line (in purple) shows all the points where the level is 10. In other words, it is essentially mapping all the points ( x and y ) that result in a z value of 10.
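
The same idea can be checked numerically without a plot. The sketch below picks the level 25, chosen here only because it lands exactly on grid points, and lists the grid points that sit on that contour.

```python
import numpy as np

x = np.arange(1, 10)
y = np.arange(1, 10)
xx, yy = np.meshgrid(x, y)
z = xx**2 + yy**2

on_level = z == 25                  # grid points lying exactly on the z = 25 contour
print(xx[on_level], yy[on_level])   # their x and y co-ordinates
print(np.sqrt(25))                  # radius of that circle
```

Both points are at distance 5 from the origin, which is why they fall on the same contour line.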

If you want to fill the contours, use contourf function.

plt.contourf(xx,yy,z,levels=[10,20,30,40,50,60,70,80,90,100])
<matplotlib.contour.QuadContourSet at 0x129a3e90>


Squeeze

The squeeze function lets you get rid of dimensions of length 1 in NumPy arrays. For example, look at the following NumPy array.

x = np.arange(9).reshape(1,9) 
x.shape
(1, 9)
array([[0, 1, 2, 3, 4, 5, 6, 7, 8]])

A series of numbers (0 – 8) can easily be represented as a vector of size 9 – just a 1 dimensional array. The additional dimension of length 1 here carries no information. If you want to remove it, use the squeeze function.

The data here can be viewed either way

  • as a 1-d array of shape (9,)
  • or as a 2-d array of shape (1, 9)

If the additional dimension does not add any value, use the squeeze function to remove it.

np.squeeze(x).shape
(9,)
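
By default squeeze removes every axis of length 1. A short sketch showing that an axis argument limits it to one chosen axis; the array y is illustrative.

```python
import numpy as np

x = np.arange(9).reshape(1, 9)
flat = np.squeeze(x)                 # drops every axis of length 1

y = np.zeros((1, 3, 1))
only_first = np.squeeze(y, axis=0)   # drop just the named length-1 axis

print(flat.shape)
print(only_first.shape)
```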

Ravel

The ravel ( ) function flattens an array of any number of dimensions into a 1-d array.

x = np.arange(9).reshape(3,3)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
x.ravel()
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
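
NumPy also has a flatten ( ) method that produces the same values. The sketch below shows the practical difference for a contiguous array like this one: ravel ( ) returns a view on the same memory where it can, while flatten ( ) always copies.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)

r = x.ravel()      # returns a view when the memory layout allows it
f = x.flatten()    # always returns a copy

print(np.shares_memory(x, r))
print(np.shares_memory(x, f))
```

Because r can share memory with x, writing to r may change x as well. Use flatten ( ) or copy ( ) when you need an independent array.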

NumPy Challenges

numbers_ar = np.array([0,1,2,0,3,4,0,5,6])

Replace all zeros in the NumPy array above with ones (1).

Hint : Use Boolean mask
code
numbers_ar[numbers_ar == 0] = 1

numbers_1 = np.array([1,2,3,4,5])
numbers_2 = np.array([6,7,8,9,10])

# output should look like this
array([[ 1,  6],
       [ 2,  7],
       [ 3,  8],
       [ 4,  9],
       [ 5, 10]])

Stack two 1-d NumPy arrays side by side, so that each array becomes a column.

Hint : Try the stack function with the axis argument
code
numbers_1 = np.array([1,2,3,4,5])
numbers_2 = np.array([6,7,8,9,10])

numbers = np.stack([numbers_1, numbers_2], axis=1)

import numpy as np

numbers = np.array ([[1,2,3],
                     [4,5,6]])
## numbers_2 = a copy of the array above

Create a copy of a numpy array

Hint : Try the copy () function
code
import numpy as np

numbers = np.array ([[1,2,3],
                     [4,5,6]])
numbers_2 = numbers.copy()

numbers = np.array([[1,np.nan,3,4,5],
                    [6,7,8,np.nan,10]])

Find all the missing values (nan) in a NumPy array

Hint : Use the isnan function
code
numbers = np.array([[1,np.nan,3,4,5],
                    [6,7,8,np.nan,10]])

np.isnan(numbers)

grades = np.array([3.0, 3.5, 4.0])

# output
array([4.0 , 3.5, 3.0 ])

How to reverse a NumPy array

Hint : Use slice notation with a blank start and end value, and a negative step at the end
code
grades = np.array([3.0, 3.5, 4.0])

grades[::-1]

grades = np.array([3.0, 3.5, 4.0, 3.3, 3.2, 4.0])

# output
array([3.0 , 4.0 , 3.2])

How to extract alternate values from a 1-d NumPy array

Hint : Use slice notation with a blank start and end value, and 2 as the step at the end
code
grades = np.array([3.0, 3.5, 4.0, 3.3, 3.2, 4.0])

grades[::2]

numbers = np.array([1,2,3,4,5])

How to square all the elements of the array

Hint : Use the pow function or the ** operator
code
numbers = np.array([1,2,3,4,5])

pow(numbers, 2)

#output
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9., 10.]])

Create a NumPy array shown above without using a list of numbers

Hint : Use the linspace function and reshape
code
np.linspace(1,10, 10).reshape(5,2)

#output
array([[ 1.,  6.],
       [ 2.,  7.],
       [ 3.,  8.],
       [ 4.,  9.],
       [ 5., 10.]])

Create a NumPy array shown above without using a list of numbers

Hint : Use the linspace function and reshape. Pay attention to the order flag of the reshape function.
code
np.linspace(1,10, 10).reshape(5,2,order="F")

import numpy as np

grades = np.array([[3.0, 3.5, 4.0, 3.3, 3.2, 4.0],
                   [3.1, 3.0, 4.0, 2.9, 2.7, 3.9]])

# output - 1
array([3. , 3. , 4. , 2.9, 2.7, 3.9])

# output - 2
array([3. , 2.7])

1. Find the element-wise minimum between the first row and the second row ( one value per column ).

2. Find the minimum of the elements in each of the rows.

Hint : Use the min function and specify the axis.
code
grades = np.array([[3.0, 3.5, 4.0, 3.3, 3.2, 4.0],
                   [3.1, 3.0, 4.0, 2.9, 2.7, 3.9]])

np.min(grades, axis=0)
np.min(grades, axis=1)

a = np.arange(9).reshape(3,3)

# array - 1
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

# output
array([[-4., -3., -2.],
       [-1.,  0.,  1.],
       [ 2.,  3.,  4.]])

Take array – 1 above and subtract the mean of all the elements in the array from each element.

Hint : Use the np.mean function
code
a = np.arange(9).reshape(3,3)
a - np.mean(a)

a = np.arange(9).reshape(3,3)

# array - 1
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

# output
array([[2, 1, 0],
       [5, 4, 3],
       [8, 7, 6]])

Swap the first column with the last column of a 2-d numpy array

Hint : Recreate the array by specifically mentioning the columns using the indexing operators.
code
a = np.arange(9).reshape(3,3)
a[:,[0,2]] = a[:,[2,0]]

import numpy as np

numbers_1 = np.array([1,3,5,7,9])
numbers_2 = np.array([1,11,13,15,19])

# Get all the common numbers between numbers_1 and numbers_2 into the following variable.
# common  =  

Get all the common numbers between numbers_1 and numbers_2 into the variable common

Hint : Use the intersect1d function
code
import numpy as np

numbers_1 = np.array([1,3,5,7,9])
numbers_2 = np.array([1,11,13,15,19])

common = np.intersect1d(numbers_1, numbers_2)

import numpy as np

numbers = np.array([1,3,5,7,9,11,13,15,17,19,21,23,25])

Get all the numbers in the array between 15 and 20 (inclusive)

Hint : Use the boolean mask
code
import numpy as np

numbers = np.array([1,3,5,7,9,11,13,15,17,19,21,23,25])

numbers[ (numbers <= 20) & (numbers >= 15) ]


Create a 2×2 boolean array

code
import numpy as np
bool_array = np.array ([[True, False],
                     [False, True]])

FAQ

What does a negative index do in reshape ( )

Say there are 100 numbers. You can fold them into a 2d array with reshape(5,20). We know that 5 x 20 = 100

What if there are 280 numbers in an array and you want to fold them into a 2d array of 5 rows. Can you guess the columns ? You would have to do the division 280/5 = 56. So, you then say reshape(5,56).

Instead of doing the division yourself, you can pass -1 for that dimension and say reshape(5,-1). NumPy will automatically work out the second dimension to be 56.
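
A quick sketch to confirm the arithmetic for a 280-element array (the names values and folded are illustrative):

```python
import numpy as np

values = np.arange(280)

folded = values.reshape(5, -1)   # -1 asks NumPy to work out this dimension
print(folded.shape)
print(280 // 5)
```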
