oteri

NumPy Arrays: An Introduction

This article introduces NumPy, short for Numerical Python, one of the foundational libraries for turning your data into a series of numbers. Machine learning then works out the patterns in those numbers.

GitHub repo

To see the complete code snippet for this tutorial, check this repo.

Prerequisites

To follow this tutorial, you need a working knowledge of the Python programming language.

Python list

NumPy arrays are similar to Python lists. NumPy itself is the backbone of much of the data science, machine learning, and numerical computing done in Python.

Why NumPy?

  • It is fast
  • Backbone of other Python scientific packages
  • Vectorization via broadcasting (avoiding loops)
  • Behind-the-scenes optimizations written in C
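
The "avoiding loops" point can be sketched with a small comparison. This is a minimal illustration, and the variable names are made up for this example rather than taken from the repo:

```python
import numpy as np

numbers = [0, 1, 2, 3, 4]

# Loop version: square each element one at a time in Python
squares_loop = [n * n for n in numbers]

# Vectorized version: one expression, the loop runs inside NumPy's compiled code
squares_vec = np.array(numbers) ** 2

print(squares_loop)            # [0, 1, 4, 9, 16]
print(squares_vec.tolist())    # [0, 1, 4, 9, 16]
```

Both versions give the same values. On large arrays the vectorized form is usually much faster, though the exact speed-up depends on the machine and the array size.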

Creating arrays

NumPy's main data type is the ndarray (n-dimensional array). To create one, pass a Python list to np.array() as shown below:

import numpy as np

a1 = np.array([1, 3, 4])
a1 # [1, 3, 4]

Other ways to create arrays include functions such as np.ones(), np.zeros(), np.random.random(), and np.arange().

ones = np.ones((2, 3))
ones # it creates an array with 2 rows and 3 columns

[[1., 1., 1.],
[1., 1., 1.]]

zeros = np.zeros(3)
zeros # creates a one-dimensional array with zeros

[0., 0., 0.]

range_array = np.arange(0, 10, 2)
range_array # creates an array of even numbers from 0 up to (but not including) 10, using a step of 2

[0, 2, 4, 6, 8]

random_array = np.random.random((5, 3))
random_array # creates a two-dimensional array of random numbers with 5 rows and 3 columns

[[0.29388711, 0.63326489, 0.7493256 ],
[0.74146885, 0.21586852, 0.76038582],
[0.21656236, 0.76250936, 0.17387777],
[0.75364549, 0.35779001, 0.66230586],
[0.20382937, 0.04364932, 0.55609439]]
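
Because the values are random, your output will differ from the numbers above. If you want repeatable results, one option is a seeded generator. The sketch below uses NumPy's np.random.default_rng(); the seed value 42 and the variable names are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is arbitrary
first = rng.random((5, 3))

rng = np.random.default_rng(seed=42)   # same seed, so the same numbers again
second = rng.random((5, 3))

print(first.shape)                     # (5, 3)
print(np.array_equal(first, second))   # True
```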

Checking array attributes

To get the array attributes and data types like size, shape, and number of dimensions, we can use the following:

  • For the type, use Python's built-in type() function:
type(a1)

numpy.ndarray

  • To check the total number of elements in a NumPy array, use the ndarray.size attribute:
a1.size

3

  • To check the data type of the elements, use the ndarray.dtype attribute:
a1.dtype, random_array.dtype

(dtype('int64'), dtype('float64'))

  • Using ndarray.shape shows the size of each dimension:
random_array.shape

(5, 3)
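
The number of dimensions mentioned above is available through the ndarray.ndim attribute. A small sketch that puts the three attributes side by side, where grid is a stand-in with the same shape as random_array:

```python
import numpy as np

a1 = np.array([1, 3, 4])
grid = np.ones((5, 3))      # stand-in for random_array, same shape

print(a1.ndim, a1.shape, a1.size)        # 1 (3,) 3
print(grid.ndim, grid.shape, grid.size)  # 2 (5, 3) 15
```

Note that size is the product of the entries in shape, and ndim is the length of shape.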

Manipulating and comparing arrays

NumPy supports element-wise arithmetic on arrays: addition, subtraction, multiplication, division, and more.

Some notable examples of this include:

# addition
a1 + ones

[[2., 4., 5.],
[2., 4., 5.]]

# subtraction
a1 - ones

[[0., 2., 3.],
[0., 2., 3.]]
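
These operations work even though a1 has shape (3,) and ones has shape (2, 3), because NumPy broadcasts the smaller array across the rows of the larger one. The sketch below shows that, along with multiplication and division, and what happens when the shapes cannot be lined up. The mismatched array [1, 2] is an illustrative example only:

```python
import numpy as np

a1 = np.array([1, 3, 4])      # shape (3,)
ones = np.ones((2, 3))        # shape (2, 3)

# a1 is treated as if it were repeated on every row of `ones`
total = a1 + ones
print(total.shape)            # (2, 3)

# Multiplication and division are element-wise too
print((a1 * 2).tolist())      # [2, 6, 8]
print((a1 / 2).tolist())      # [0.5, 1.5, 2.0]

# Shapes (2, 3) and (2,) are incompatible, so NumPy raises an error
try:
    ones + np.array([1, 2])
except ValueError as err:
    print("cannot broadcast:", err)
```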

Aggregation

Aggregation means reducing a collection of values to a single summary value, such as a sum, mean, minimum, or maximum.

listy_list = [2, 5, 9] 
type(listy_list) # it returns the type as list

Sum the list using Python's built-in sum() function.

sum(listy_list) # returns the value of 16

For a NumPy array, use np.sum() instead:

np.sum(a1) # returns a value of 8

Note: Use Python's built-in functions (sum()) on Python datatypes and NumPy's functions (np.sum()) on NumPy arrays.
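
NumPy offers other aggregation functions besides np.sum(), and most accept an axis argument to aggregate along rows or columns. A short sketch, assuming a small 2x3 array named matrix that is used here only for illustration:

```python
import numpy as np

matrix = np.array([[1, 2.0, 3.3], [4, 5, 8.5]])   # illustrative 2x3 array

overall_max = np.max(matrix)          # largest value in the whole array
column_sums = np.sum(matrix, axis=0)  # collapse the rows: one sum per column
row_means = np.mean(matrix, axis=1)   # collapse the columns: one mean per row

print(overall_max)          # 8.5
print(column_sums.shape)    # (3,)
print(row_means.shape)      # (2,)
```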

Reshaping and transposing

Reshaping in NumPy means changing the shape of an array while keeping the same elements in the same order.

For instance, we can reshape a 2-dimensional array into a 3-dimensional array using the .reshape() method like this:

random_array = np.random.rand(5, 3)
random_array

[[0.97861834, 0.79915856, 0.46147936],
[0.78052918, 0.11827443, 0.63992102],
[0.14335329, 0.94466892, 0.52184832],
[0.41466194, 0.26455561, 0.77423369],
[0.45615033, 0.56843395, 0.0187898 ]]

random_array.reshape(3, 5, 1)

The new shape (3, 5, 1) has three axes: 3 blocks, each with 5 rows and 1 column. The product 3 × 5 × 1 must equal the 15 elements in the original array.

[[[0.97861834],
[0.79915856],
[0.46147936],
[0.78052918],
[0.11827443]],

[[0.63992102],
[0.14335329],
[0.94466892],
[0.52184832],
[0.41466194]],

[[0.26455561],
[0.77423369],
[0.45615033],
[0.56843395],
[0.0187898 ]]]

reshape() returns a new array and leaves random_array itself unchanged, so check the shape of the result with:

random_array.reshape(3, 5, 1).shape

(3, 5, 1)
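
The rules behind reshaping can be sketched on a small array. The name arr and its values are illustrative. The example shows that the original array keeps its shape, that -1 lets NumPy infer one dimension, and that a shape with the wrong number of elements is rejected:

```python
import numpy as np

arr = np.arange(15)              # 15 elements: 0, 1, ..., 14

reshaped = arr.reshape(3, 5, 1)  # 3 * 5 * 1 == 15, so this works
print(reshaped.shape)            # (3, 5, 1)
print(arr.shape)                 # (15,), the original is unchanged

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(5, -1)
print(inferred.shape)            # (5, 3)

try:
    arr.reshape(4, 4)            # 4 * 4 == 16, which does not match 15
except ValueError as err:
    print("reshape failed:", err)
```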

Transposing also changes the shape of an array, but you do not specify the new dimensions: the .T attribute reverses the order of the axes, so rows become columns.

a2 = np.array([[1, 2.0, 3.3], [4, 5, 8.5]])
a2.shape # (2, 3)

Let's swap the axes to get a shape of (3, 2):

a2.T

[[1. , 4. ],
[2. , 5. ],
[3.3, 8.5]]
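
To confirm what the transpose did, you can compare shapes and individual elements. A minimal check using the same a2 array:

```python
import numpy as np

a2 = np.array([[1, 2.0, 3.3], [4, 5, 8.5]])

print(a2.shape)        # (2, 3)
print(a2.T.shape)      # (3, 2)

# Element [i, j] of the original sits at [j, i] in the transpose
print(a2[0, 2], a2.T[2, 0])                     # 3.3 3.3
print(np.array_equal(a2.T, np.transpose(a2)))   # True
```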

Comparison operators

You can compare NumPy arrays element by element using the greater than (>), less than (<), and equal to (==) operators. Each comparison returns an array of booleans.

a1 > a2

[[False, True, True],
[False, False, False]]

a1 == a1 # returns [True, True, True], one result per element
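
Since a comparison returns a boolean array, two common follow-ups are using it as a mask to select elements and collapsing it to a single True or False. A small sketch, where the name mask is just for illustration:

```python
import numpy as np

a1 = np.array([1, 3, 4])

mask = a1 > 2                     # one boolean per element
print(mask.tolist())              # [False, True, True]
print(a1[mask].tolist())          # [3, 4], only the elements where mask is True

# Collapse an element-wise comparison to a single answer
print(np.array_equal(a1, a1))     # True
print((a1 == a1).all())           # True
```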

Reading images in NumPy

With NumPy, you can turn an image into a NumPy array.

Save an image in an images folder next to your script or notebook (the code below expects images/numpy-panda.png), then run this code:

from matplotlib.image import imread

panda = imread("images/numpy-panda.png")
print(panda.dtype, type(panda))

float32 <class 'numpy.ndarray'>

To see the pixel values, slice the first five rows of the array like this:

panda[:5]

Index result of the panda array
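
An image loaded this way is an ordinary ndarray, so the attributes and slicing covered earlier apply to it too. As a sketch that runs without an image file, the block below builds a small synthetic array in the (height, width, channels) layout that colour images typically use. The 4x6 size and the name image are made up for illustration and do not describe the panda picture:

```python
import numpy as np

# Hypothetical 4x6 RGB image: height 4, width 6, 3 colour channels
image = np.zeros((4, 6, 3), dtype=np.float32)
image[:, :, 0] = 1.0          # fill the red channel, giving a solid red image

print(image.shape)            # (4, 6, 3)
print(image.dtype)            # float32
print(image[:2].shape)        # (2, 6, 3), the first two rows of pixels
print(image[0, 0].tolist())   # [1.0, 0.0, 0.0], the top-left pixel
```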

If you enjoyed this article, kindly share your thoughts about NumPy.
