This article focuses on NumPy, short for Numerical Python, one of the foundational libraries for turning your data into arrays of numbers. Machine learning then works out the patterns in those numbers.
GitHub repo
To see the complete code snippet for this tutorial, check this repo.
Prerequisites
To follow this tutorial, you need a working knowledge of the Python programming language.
Python list
NumPy arrays are similar to Python lists. NumPy itself is the backbone of most data science, machine learning, and numerical computing in Python.
Why NumPy?
- It is fast
- Backbone of other Python scientific packages
- Vectorization via broadcasting (avoiding loops)
- Behind-the-scenes optimizations written in C
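To make the vectorization point concrete, here is a minimal sketch that doubles every value once with a plain Python loop and once with a single NumPy expression. The sample values and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
values = [1.5, 2.0, 3.5, 4.0]

# Pure-Python approach: an explicit loop over every element
doubled_loop = []
for v in values:
    doubled_loop.append(v * 2)

# NumPy approach: one vectorized expression, no explicit Python loop
values_array = np.array(values)
doubled_vectorized = values_array * 2

print(doubled_loop)        # [3.0, 4.0, 7.0, 8.0]
print(doubled_vectorized)  # [3. 4. 7. 8.]
```

Both approaches give the same numbers. The vectorized version is shorter, and on large arrays it is typically much faster because the looping happens in compiled code.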
Creating arrays
NumPy's main datatype is the ndarray (n-dimensional array). To create an array in NumPy, pass a Python list to np.array() as shown below:
import numpy as np
a1 = np.array([1, 3, 4])
a1 # [1, 3, 4]
Other ways of creating arrays include functions like ones(), zeros(), random.random(), and arange().
ones = np.ones((2, 3))
ones # it creates an array with 2 rows and 3 columns
[[1., 1., 1.],
[1., 1., 1.]]
zeros = np.zeros(3)
zeros # creates a one-dimensional array with zeros
[0., 0., 0.]
range_array = np.arange(0, 10, 2)
range_array # creates an array of even numbers from 0 up to (but not including) 10 with a step of 2
[0, 2, 4, 6, 8]
random_array = np.random.random((5, 3))
random_array # creates a two-dimensional array of random numbers with 5 rows and 3 columns
[[0.29388711, 0.63326489, 0.7493256 ],
[0.74146885, 0.21586852, 0.76038582],
[0.21656236, 0.76250936, 0.17387777],
[0.75364549, 0.35779001, 0.66230586],
[0.20382937, 0.04364932, 0.55609439]]
Checking array attributes
To get the array attributes and data types like size, shape, and number of dimensions, we can use the following:
- For the type, use Python's built-in type() function:
type(a1)
numpy.ndarray
- To check the total number of elements in a NumPy array, use the ndarray.size attribute:
a1.size
3
- To check the data type of the array's elements, use the ndarray.dtype attribute:
a1.dtype, random_array.dtype
(dtype('int64'), dtype('float64'))
- To see the size of each dimension, use the ndarray.shape attribute:
random_array.shape
(5, 3)
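The list above mentions the number of dimensions, which is available through the ndarray.ndim attribute. The following sketch recreates a1 and random_array from the earlier examples and prints the attributes side by side.

```python
import numpy as np

a1 = np.array([1, 3, 4])
random_array = np.random.random((5, 3))

# ndim is the number of dimensions (axes) of the array
print(a1.ndim)            # 1
print(random_array.ndim)  # 2

# shape is a tuple with one entry per dimension
print(a1.shape)           # (3,)

# size is the total number of elements: 5 * 3
print(random_array.size)  # 15
```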
Manipulating and comparing arrays
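NumPy supports element-wise arithmetic on arrays, including addition, subtraction, multiplication, and division. When the shapes differ, as with a1 (shape (3,)) and ones (shape (2, 3)), NumPy broadcasts the smaller array across the larger one.
Some examples: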
# addition
a1 + ones
[[2., 4., 5.],
[2., 4., 5.]]
# subtraction
a1 - ones
[[0., 2., 3.],
[0., 2., 3.]]
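Multiplication and division work the same way as addition and subtraction. This is a small sketch that reuses a1 from above; the twos array is a hypothetical helper introduced only so the results are easy to check by eye.

```python
import numpy as np

a1 = np.array([1, 3, 4])
twos = np.full((2, 3), 2.0)  # hypothetical 2x3 array filled with 2.0

# Element-wise multiplication: a1 is broadcast across both rows of twos
product = a1 * twos
print(product)
# [[2. 6. 8.]
#  [2. 6. 8.]]

# Element-wise division follows the same broadcasting rule
quotient = a1 / twos
print(quotient)
# [[0.5 1.5 2. ]
#  [0.5 1.5 2. ]]
```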
Aggregation
Aggregation means reducing a collection of values to a single summary value, such as a sum or a mean.
listy_list = [2, 5, 9]
type(listy_list) # it returns the type as list
Sum the list using Python's built-in sum():
sum(listy_list) # returns the value of 16
For a NumPy array such as a1, use np.sum() instead:
np.sum(a1) # returns a value of 8
Note: Use Python's built-in functions (sum()) on Python datatypes and NumPy's functions (np.sum()) on NumPy arrays.
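Aggregations are not limited to a single total. As a minimal sketch, the example below uses a hypothetical 2x3 array named matrix to show how the axis argument of np.sum() aggregates per column or per row, alongside a few other common aggregation functions.

```python
import numpy as np

# Hypothetical 2x3 array used only for this illustration
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(np.sum(matrix))          # 21, the sum of every element
print(np.sum(matrix, axis=0))  # [5 7 9], one sum per column
print(np.sum(matrix, axis=1))  # [ 6 15], one sum per row

print(np.mean(matrix))  # 3.5
print(np.max(matrix))   # 6
print(np.min(matrix))   # 1
```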
Reshaping and transposing
Reshaping in NumPy means changing the shape of an array without changing its data.
For instance, we can reshape a 2-dimensional array into a 3-dimensional array using the .reshape() method like this:
random_array = np.random.rand(5, 3)
random_array
[[0.97861834, 0.79915856, 0.46147936],
[0.78052918, 0.11827443, 0.63992102],
[0.14335329, 0.94466892, 0.52184832],
[0.41466194, 0.26455561, 0.77423369],
[0.45615033, 0.56843395, 0.0187898 ]]
random_array.reshape(3, 5, 1)
The new shape (3, 5, 1) means 3 blocks, each with 5 rows and 1 column. The total number of elements must stay the same: 5 × 3 = 15 and 3 × 5 × 1 = 15.
[[[0.97861834],
[0.79915856],
[0.46147936],
[0.78052918],
[0.11827443]],
[[0.63992102],
[0.14335329],
[0.94466892],
[0.52184832],
[0.41466194]],
[[0.26455561],
[0.77423369],
[0.45615033],
[0.56843395],
[0.0187898 ]]]
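Note that reshape() returns a new array and leaves random_array itself unchanged. To check the shape of the reshaped result, use:
random_array.reshape(3, 5, 1).shape
(3, 5, 1)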
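A short sketch of how reshape() behaves in practice: it returns a new array, it can infer one dimension when you pass -1, and it raises a ValueError when the element counts do not match. The names reshaped and inferred are introduced here for illustration only.

```python
import numpy as np

random_array = np.random.random((5, 3))

# reshape() returns a new array; the original keeps its shape
reshaped = random_array.reshape(3, 5, 1)
print(random_array.shape)  # (5, 3)
print(reshaped.shape)      # (3, 5, 1)

# Passing -1 lets NumPy infer that dimension from the element count (15 / 3 = 5)
inferred = random_array.reshape(3, -1)
print(inferred.shape)      # (3, 5)

# A shape whose element count differs from 15 is rejected
try:
    random_array.reshape(4, 4)
except ValueError as error:
    print("Cannot reshape:", error)
```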
Transposing also changes the shape of an array, but you do not need to specify any values: the .T attribute reverses the order of the axes automatically.
a2 = np.array([[1, 2.0, 3.3], [4, 5, 8.5]])
a2.shape # (2, 3)
Let's swap the axes to get a shape of (3, 2):
a2.T
[[1. , 4. ],
[2. , 5. ],
[3.3, 8.5]]
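To see what the transpose does to individual elements, this sketch reuses a2 and checks that the value at [row, column] in the original appears at [column, row] in the transposed array. The name transposed is introduced only for this example.

```python
import numpy as np

a2 = np.array([[1, 2.0, 3.3], [4, 5, 8.5]])
transposed = a2.T

print(a2.shape)          # (2, 3)
print(transposed.shape)  # (3, 2)

# Element [row, column] of the original sits at [column, row] after transposing
print(a2[0, 2], transposed[2, 0])  # 3.3 3.3

# np.transpose() gives the same result as the .T attribute
print(np.array_equal(np.transpose(a2), transposed))  # True
```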
Comparison operators
You can compare NumPy arrays using the greater than, less than, and equal to operators. The comparison is done element by element and returns an array of booleans.
a1 > a2
[[False, True, True],
[False, False, False]]
a1 == a1 # [True, True, True]
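Because a comparison returns a boolean array, you can use the result to filter values or reduce it to a single True or False. A minimal sketch, with mask as a name introduced for illustration:

```python
import numpy as np

a1 = np.array([1, 3, 4])

# Element-wise comparison produces a boolean array
mask = a1 > 2
print(mask)      # [False  True  True]

# A boolean array can be used to select the matching elements
print(a1[mask])  # [3 4]

# Reduce an element-wise comparison to a single boolean
print((a1 == a1).all())        # True
print(np.array_equal(a1, a1))  # True
```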
Reading images in NumPy
With NumPy, you can turn an image into a NumPy array.
Save an image in an images folder next to your script or notebook, adjust the file name if needed, and run this code:
from matplotlib.image import imread
panda = imread("images/numpy-panda.png")
print(panda.dtype, type(panda))
float32 <class 'numpy.ndarray'>
To see the pixel values, slice the array. For example, this shows the first five rows of pixels:
panda[:5]
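If you do not have an image file at hand, the sketch below builds a small synthetic array instead. It assumes the common layout of a color image as (height, width, channels) with float32 values between 0 and 1; the fake_image array and its dimensions are hypothetical and not taken from the panda image.

```python
import numpy as np

# Hypothetical 4x6 RGB "image" with float32 values between 0 and 1
height, width, channels = 4, 6, 3
fake_image = np.zeros((height, width, channels), dtype=np.float32)
fake_image[:, :, 0] = 1.0  # set the red channel to full intensity

print(fake_image.shape)      # (4, 6, 3)
print(fake_image.dtype)      # float32

# Slicing the first axis selects rows of pixels
print(fake_image[:2].shape)  # (2, 6, 3)

# A single pixel is an array with one value per channel
print(fake_image[0, 0])      # [1. 0. 0.]
```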
If you enjoyed this article, kindly share your thoughts about NumPy.