Getting Started With NumPy: The Best Tutorial for Beginners

Last Updated: 12th January, 2024

Harshini Bhat

Data Science Consultant at AlmaBetter

NumPy is a powerful library for numerical operations, linear algebra, and array manipulation in Python. Discover its fundamentals and learn the basics.

Are you tired of slow numerical computations and struggling with data manipulation in Python? Meet NumPy - a powerful library that performs numerical operations, linear algebra, and array manipulation in Python efficiently.

NumPy is a must-have tool for anyone who works with data in Python or plans to, from data scientists and data engineers to machine learning enthusiasts. In this article, we will explore the basics of NumPy and get started with this powerful library.

Everything from installing NumPy to creating arrays and performing mathematical operations will be covered in this article. Whether you are an experienced Python developer or just starting out, get ready to upgrade your numerical computing skills with NumPy!

What is NumPy?

NumPy (short for Numerical Python) is a powerful open-source Python library used in practically every field of science and engineering. It's the universal Python standard for working with numerical data, and it's at the heart of the scientific Python and PyData ecosystems. NumPy users range from novice coders to expert researchers conducting cutting-edge scientific and industrial research and development. Pandas, SciPy, Matplotlib, scikit-learn, scikit-image, and most other data science and scientific Python packages make substantial use of the NumPy API.

The NumPy library includes multidimensional array and matrix data structures (more on these in later sections). It provides ndarray, a homogeneous n-dimensional array object, along with methods for operating on it efficiently.

NumPy can be used to perform a wide range of mathematical operations on arrays. It extends Python with powerful data structures that make calculations with arrays and matrices efficient, and it supplies a vast library of high-level mathematical functions that operate on these arrays and matrices.

Why is NumPy Useful?

NumPy provides a host of features that make it a versatile and indispensable tool for scientific computing and data analysis. It allows you to work with large, multi-dimensional arrays and perform complex mathematical operations on them efficiently. NumPy arrays are faster and more compact than Python lists: an array stores its elements in substantially less memory, and NumPy lets you specify the data type of those elements, which allows the code to be optimized even further. NumPy's extensive library of functions and tools makes it easy to perform linear algebra, statistical analysis, and data manipulation tasks. Plus, NumPy integrates seamlessly with other Python libraries, making it a valuable asset in a wide range of applications.
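The memory claim can be checked with a small sketch. The values, the variable names, and the way the list's footprint is estimated are our own assumptions for illustration; the list estimate is only a rough figure and depends on the Python implementation.

```python
import sys
import numpy as np

values = list(range(100))

# Rough estimate of the list's footprint: the list object itself
# plus every Python int object it points to (CPython-specific).
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# The same numbers stored as 8-byte and as 1-byte integers.
arr64 = np.array(values, dtype=np.int64)
arr8 = np.array(values, dtype=np.int8)

print(list_bytes)     # varies by Python version
print(arr64.nbytes)   # 800 (100 elements * 8 bytes)
print(arr8.nbytes)    # 100 (100 elements * 1 byte)
```

Choosing a smaller data type saves memory only when every value fits in it; int8, for instance, covers -128 to 127.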

What types of problems can NumPy solve?

NumPy is designed to solve a wide range of numerical problems, such as:

Manipulating and processing large amounts of data efficiently
Performing complex mathematical operations, including linear algebra and Fourier transforms
Generating random numbers and performing statistical analysis
Preparing data for plotting and visualization with libraries such as Matplotlib
Providing the array foundation for machine learning and data mining

NumPy is an incredibly useful library that helps to solve a wide range of numerical problems efficiently and effectively.
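A few of these problem types can be sketched in a handful of lines. The system of equations, the signal, and the random seed below are made-up values chosen only for illustration.

```python
import numpy as np

# Linear algebra: solve the system 3x + y = 9 and x + 2y = 8.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)          # approximately [2, 3]

# Fourier transform: a constant signal has all of its energy
# in the zero-frequency component.
signal = np.ones(4)
spectrum = np.fft.fft(signal)      # approximately [4, 0, 0, 0]

# Random numbers and statistics: draw samples from a standard
# normal distribution and summarize them.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)
print(x, spectrum.real, samples.mean(), samples.std())
```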

Prerequisites

One should be familiar with Python. See the Python tutorial for a refresher.

Matplotlib and NumPy are both required to run the examples. We will go through the installation in this article.

Objectives and Outcomes

An overview of arrays in NumPy, including how to represent and handle n-dimensional arrays.
How to distinguish between one-, two-, and n-dimensional arrays in NumPy.
How to perform some linear algebra calculations on n-dimensional arrays without using for-loops.

Installing NumPy

We strongly advise using a scientific Python distribution to install NumPy. See Installing NumPy for complete instructions on installing NumPy on your operating system.

If you already have Python installed, you can install NumPy with conda (conda install numpy) or with pip (pip install numpy).

If you don't already have Python, you might want to look into Anaconda. It's the most straightforward way to get started. The advantage of this distribution is that we won't have to worry about separately installing NumPy or any of the other key packages we will be using for data analysis, such as pandas, scikit-learn, and so on.

Read our latest guide on "How to Install Anaconda in Windows"

How to import NumPy?

We can import NumPy into our Python code with the statement import numpy as np.

We shorten the imported name to np to improve code readability while using NumPy. This is a well-accepted convention that you should adhere to so that anyone working with your code may understand it quickly.

The Basics

What is an array?

The NumPy library's central data structure is the array. An array is a grid of values that also carries information about the raw data, how to locate an element, and how to interpret an element. Its grid of elements can be indexed in various ways. All of the elements are of the same type, which is referred to as the array's dtype.

An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is its number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.

For example, an array can be one-dimensional, such as np.array([1, 2, 3, 4, 5, 6]), or two-dimensional, such as np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]).

With square brackets, we can access the array's elements. When accessing elements, keep in mind that NumPy indexing begins at 0. That is, if we wish to access the first element in our array, we use index 0.
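The following sketch creates a one-dimensional and a two-dimensional array and indexes them in the ways described above. The names a and b are just illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print(a[0])        # 1, the first element
print(b[0])        # [1 2 3 4], the first row
print(b[1, 2])     # 7, row 1 and column 2
print(a[a > 3])    # [4 5 6], indexing with booleans
print(a[[0, 2]])   # [1 3], indexing with a list of integers

print(b.ndim)      # 2, the rank (number of dimensions)
print(b.shape)     # (3, 4), the size along each dimension
```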

An array is sometimes referred to as an "ndarray," which is shorthand for "N-dimensional array." An N-dimensional array is simply an array with any number of dimensions. We may also come across terms like 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. Matrices and vectors are both represented by the NumPy ndarray class. A vector is a one-dimensional array (there is no distinction between row and column vectors), whereas a matrix is a two-dimensional array. The term tensor is commonly used for arrays with three or more dimensions.
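As a quick illustration of these terms, the sketch below builds a vector, a matrix, and a three-dimensional tensor and inspects their dimensions. The variable names and values are arbitrary.

```python
import numpy as np

vector = np.array([1, 2, 3])            # 1-D
matrix = np.array([[1, 2], [3, 4]])     # 2-D
tensor = np.zeros((2, 3, 4))            # 3-D

print(vector.ndim, vector.shape)   # 1 (3,)
print(matrix.ndim, matrix.shape)   # 2 (2, 2)
print(tensor.ndim, tensor.shape)   # 3 (2, 3, 4)

# A 1-D array has no row or column orientation,
# so transposing it leaves the shape unchanged.
print(vector.T.shape)              # (3,)
```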

What are the attributes of an array?

Let us look at the attributes with the help of Python code examples. The attributes of an array are as follows:

Size: The size of an array is the number of elements it holds. It is determined when the array is created and stays fixed for the lifetime of the array.
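A short sketch of how the size can be inspected, using an arbitrary 2 x 3 array:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.size)    # 6, the total number of elements
print(arr.shape)   # (2, 3), the size along each dimension
```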

Type: The type of an array is the data type of its elements. All the elements in an array must be of the same data type.
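The data type is exposed through the dtype attribute. In this sketch the arrays and their names are illustrative; the default integer type is left unprinted because it depends on the platform.

```python
import numpy as np

floats = np.array([1.0, 2.0, 3.0])
print(floats.dtype)                # float64

# The data type can be requested explicitly at creation time.
small = np.array([1, 2, 3], dtype=np.float32)
print(small.dtype)                 # float32

# astype returns a converted copy; the original is unchanged.
as_ints = floats.astype(np.int64)
print(as_ints.dtype)               # int64
```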

Indexing: Arrays are indexed using integer values that represent the position of an element in the array. The index values start from zero and go up to the size of the array minus one.

Contiguous memory allocation: A newly created array stores its elements in one contiguous block of memory. Views of an array, such as a single column of a two-dimensional array, may refer to elements of that block that are not adjacent.
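The memory layout can be inspected through the flags and strides attributes. This is a minimal sketch with an arbitrary 3 x 4 array of 8-byte integers.

```python
import numpy as np

arr = np.arange(12, dtype=np.int64).reshape(3, 4)
print(arr.flags["C_CONTIGUOUS"])      # True
print(arr.strides)                    # (32, 8): bytes to step per row, per column

# A single column is a view that skips over the other columns,
# so its elements are not adjacent in memory.
column = arr[:, 1]
print(column.flags["C_CONTIGUOUS"])   # False

# np.ascontiguousarray copies the data into a contiguous block.
packed = np.ascontiguousarray(column)
print(packed.flags["C_CONTIGUOUS"])   # True
```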

Homogeneity: Because every element shares one data type, an array cannot hold a mix of types. If values of different types are supplied, NumPy converts them to a common type.
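The conversion to a common type can be observed directly. The values below are arbitrary examples.

```python
import numpy as np

# Integers and floats together are stored as floats.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# Numbers and text together are stored as text.
text = np.array([1, "two"])
print(text.dtype.kind)    # U (Unicode string)

# Assigning a float into an integer array truncates the value.
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(ints[0])            # 9
```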

Fixed size: The size of an array is fixed and cannot be changed once it is created. To add or remove elements, a new array must be created and the elements copied over. The np.append function, for example, does not modify the original array; it returns a new array that includes the added element.
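A minimal sketch showing that np.append leaves the original array untouched:

```python
import numpy as np

arr = np.array([1, 2, 3])
bigger = np.append(arr, 4)

print(arr)                            # [1 2 3], unchanged
print(bigger)                         # [1 2 3 4], a new array
print(np.shares_memory(arr, bigger))  # False
```

Because every call copies the data, appending inside a loop is slow; collecting values in a Python list and converting once is usually the better choice.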

Efficiency: Arrays offer efficient access to elements because they use constant-time indexing. This means that accessing any element of an array takes the same amount of time, regardless of the size of the array.


Basic Operations

Arrays support a variety of basic operations. Here are some of the most common ones:

1. Creating an array: An array is created by passing a Python list (written in square brackets) to np.array().

2. Accessing an element: Elements in an array can be accessed using their index.

3. Updating an element: Elements in an array can be updated by assigning to their index.

4. Adding an element: Elements can be added to the end of an array with the np.append() function, which returns a new array.

5. Removing an element: Elements can be removed by position with the np.delete() function, which also returns a new array.

6. Sorting an array: The np.sort() function returns a sorted copy of an array, while the array's sort() method sorts it in place.

7. Reversing an array: An array can be reversed with the slice [::-1] or with the np.flip() function.

8. Finding the length of an array: The len function returns the length of the first dimension, and the size attribute gives the total number of elements.

9. Iterating over an array: Arrays can be iterated over using a for loop, although vectorized operations are usually faster.
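Operations 1 to 9 can be sketched with NumPy as follows. The array contents and variable names are arbitrary, and the NumPy functions shown are one reasonable choice for each operation rather than the only one.

```python
import numpy as np

arr = np.array([5, 2, 9, 1])          # 1. create
first = arr[0]                        # 2. access           -> 5
arr[1] = 7                            # 3. update           -> [5 7 9 1]
appended = np.append(arr, 4)          # 4. add (new array)  -> [5 7 9 1 4]
removed = np.delete(arr, 0)           # 5. remove index 0   -> [7 9 1]
without_nine = arr[arr != 9]          #    remove by value  -> [5 7 1]
ordered = np.sort(arr)                # 6. sorted copy      -> [1 5 7 9]
backwards = arr[::-1]                 # 7. reversed view    -> [1 9 7 5]
length = len(arr)                     # 8. length           -> 4

# 9. iterate with a for loop
total = 0
for value in arr:
    total += int(value)
print(total)                          # 22
```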

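10. Applying arithmetic operators: Arithmetic operators can be applied to arrays to perform element-wise operations.

Addition/Subtraction: The + and - operators combine two arrays of the same shape element by element. For example, np.arange(5) produces array([0, 1, 2, 3, 4]), and subtracting it from array([10, 20, 30, 40, 50]) gives:

Output: array([10, 19, 28, 37, 46])

Multiplication/Division: The * and / operators also work element by element. Note that * is not matrix multiplication.

Exponentiation: To raise each element of an array to a power, the ** operator can be used. For example, squaring an array that holds 1, 2, and 3 gives:

Output: [1, 4, 9]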
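The element-wise operators can be tried out with the sketch below. The input arrays a and b are assumptions on our part, chosen so that the subtraction matches the output shown above.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])
b = np.arange(5)                 # array([0, 1, 2, 3, 4])

print(a + b)                     # [10 21 32 43 54]
print(a - b)                     # [10 19 28 37 46]
print(a * b)                     # [  0  20  60 120 200]
print(a / 2)                     # [ 5. 10. 15. 20. 25.]
print(np.array([1, 2, 3]) ** 2)  # [1 4 9]
```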

Universal Functions

In NumPy, a universal function (ufunc) is a function that performs element-wise operations on ndarrays. Universal functions are important in NumPy because they allow you to perform fast and vectorized operations on large arrays, without the need for Python loops.

Here are some examples of universal functions in NumPy:

np.add(a,b) - computes the element-wise sum of two arrays
np.subtract(a,b) - computes the element-wise difference of two arrays
np.multiply(a,b) - computes the element-wise product of two arrays
np.divide(a,b) - computes the element-wise division of two arrays
np.exp(a) - computes the element-wise exponential of an array
np.sqrt(a) - computes the element-wise square root of an array
np.sin(a) - computes the element-wise sine of an array
np.cos(a) - computes the element-wise cosine of an array
np.tan(a) - computes the element-wise tangent of an array
np.maximum(a,b) - computes the element-wise maximum of two arrays
np.minimum(a,b) - computes the element-wise minimum of two arrays

All of these functions operate element-wise on the input arrays, which means that they perform the same operation on each element of the array. For example, if we have two arrays x and y, np.add(x, y) will compute the sum of the first element of x and the first element of y, the sum of the second element of x and the second element of y, and so on.
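A short sketch of a few of these universal functions, using two small arrays x and y with made-up values:

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([3.0, 2.0, 1.0])

print(np.add(x, y))       # [ 4.  6. 10.]
print(np.sqrt(x))         # [1. 2. 3.]
print(np.maximum(x, y))   # [3. 4. 9.]
print(np.minimum(x, y))   # [1. 2. 1.]

# Each of these functions is an instance of np.ufunc.
print(isinstance(np.add, np.ufunc))   # True
```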

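NumPy also provides many other universal functions, and we can even wrap our own Python functions so that they apply element-wise: np.frompyfunc() turns a Python function into a ufunc, and np.vectorize() offers a similar convenience wrapper. Both still call the Python function once per element, so they are provided for convenience rather than speed.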
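Here is a minimal sketch of both wrappers. The helper clip_negative is a hypothetical function written for this example; it is not part of NumPy.

```python
import numpy as np

def clip_negative(value):
    """Hypothetical helper: replace a negative number with zero."""
    return max(value, 0)

data = np.array([-1, 2, -3])

# np.frompyfunc(func, nin, nout) returns a true ufunc whose
# results come back as an array of Python objects.
clip_ufunc = np.frompyfunc(clip_negative, 1, 1)
as_objects = clip_ufunc(data)
clipped = as_objects.astype(np.int64)
print(as_objects.dtype, clipped)           # object [0 2 0]

# np.vectorize wraps the same function but is not itself a ufunc.
clip_vectorized = np.vectorize(clip_negative)
print(clip_vectorized(data))               # [0 2 0]
print(isinstance(clip_ufunc, np.ufunc))    # True
```

For this particular task the built-in np.maximum(data, 0) gives the same result and runs much faster, which is why custom wrappers are best kept for logic that has no built-in equivalent.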

Broadcasting

There may be occasions when we need to perform an operation between an array and a single number (also known as an operation between a vector and a scalar) or between arrays of different sizes. For example, our array (which we'll refer to as "data") may contain distances in miles that we want to convert to kilometers.

This operation can be carried out in Python as follows:
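A minimal sketch of this conversion, assuming a small array of made-up distances and the approximate factor of 1.6 kilometers per mile. The last part shows what happens when two shapes cannot be matched.

```python
import numpy as np

data = np.array([1.0, 2.0])        # distances in miles (made-up values)
kilometers = data * 1.6            # the scalar is applied to every element
print(kilometers)                  # [1.6 3.2]

# A (3, 1) column and a (2,) row are compatible because
# each pair of dimensions is either equal or contains a 1.
column = np.array([[1.0], [2.0], [3.0]])
print((column + data).shape)       # (3, 2)

# Shapes (3, 2) and (3,) are not compatible: the last dimensions, 2 and 3, differ.
try:
    np.ones((3, 2)) + np.ones(3)
    failed = False
except ValueError:
    failed = True
print(failed)                      # True
```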

NumPy understands that the multiplication should be applied to each element. This is known as broadcasting. Broadcasting is a mechanism that allows NumPy to operate on arrays of different shapes. The dimensions of the arrays must be compatible: two dimensions are compatible when they are equal or when one of them is 1. If the dimensions are incompatible, a ValueError is raised.

Conclusion

Learning NumPy is a must for anyone interested in scientific computing or data analysis with Python. NumPy's efficient arrays and sophisticated array operations allow you to manipulate enormous volumes of data quickly and simply. By learning the fundamentals of NumPy, we will be able to use the various tools and libraries that are built on top of it, such as Pandas, Matplotlib, and SciPy.

While the syntax of NumPy may appear difficult at first, it is well worth the time and effort to become acquainted with the library, as it will save you many hours of work in the long run. So, if you have not started already, plunge into NumPy and unleash the full power of scientific computing.
