NumPy and the `ndarray`

When working with large amounts of data, the built-in Python data structures begin to show their limitations.

The standard list is designed for heterogeneous data: each element is a reference to a separate Python object, so it uses memory less efficiently than a structure designed for homogeneous numerical data.
NumPy's array structures are optimized for mathematics such as linear algebra.
They are the backbone of many libraries and data structures in the SciPy stack.
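
As a rough illustration of the memory difference, the sketch below compares the footprint of a list of integers with an array holding the same values. It assumes CPython and an explicit 64-bit integer dtype; the names `values`, `list_bytes`, and `array_bytes` are illustrative only, and the exact list size varies by Python version and platform.

```python
import sys

import numpy as np

# 1,000 integers stored two ways (illustrative data, not from the example file)
values = list(range(1_000, 2_000))
array = np.array(values, dtype=np.int64)

# A list stores references, so count the list object plus every int object it points to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw values back to back: 8 bytes per int64 element.
# nbytes covers the element buffer only, not the small fixed ndarray header.
array_bytes = array.nbytes

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes ({array.itemsize} bytes per element)")
```

On a typical CPython build the list comes out several times larger than the array, because every element carries its own object overhead.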

Note: Consulting NumPy's official documentation is strongly advised. The official docs are well written and go into more depth than this presentation.

Basic operations on `numpy` arrays and their standard-library counterparts

# uci_bootcamp_2021/examples/numpy_example.py

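# Assuming `randbelow` refers to the standard library's secrets.randbelow
from secrets import randbelow

import numpy as np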

# Libraries in the SciPy stack are conventionally imported under short aliases (here, `np`).
# In an effort to be consistent with existing documentation, I will follow that convention.

# Generating some test data
vanilla_list = [randbelow(2 ** 32) for _ in range(500)]

# Creating a 1-D array from the vanilla list.
array = np.array(vanilla_list)

# Taking the sum via vanilla means
print(sum(vanilla_list))
# Taking the sum via numpy
print(array.sum())

# Taking the subset of values that are even
evens = array[array % 2 == 0]
# and the equivalent pure-python list:
evens_list = [value for value in vanilla_list if value % 2 == 0]  # "list comprehension"

# multiplying all values of the array by a scalar
double_array = array * 2
double_list = [value * 2 for value in vanilla_list]  # "list comprehension"

# Taking the dot product of two arrays:
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
dot = v1.dot(v2)
# alternatively,
dot = v1 @ v2
print(dot)
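
The listing above pairs most NumPy operations with a pure-Python counterpart. The sketch below checks that the pairs agree on a small, hand-picked input, adds a list-based dot product for comparison, and points out one place where the same operator behaves differently. The names `values`, `l1`, `l2`, `dot_list`, and `dot_array` are illustrative and not part of the example file.

```python
import numpy as np

values = [3, 8, 5, 12, 7]
array = np.array(values)

# Boolean-mask indexing and the list comprehension select the same elements.
evens = array[array % 2 == 0]
evens_list = [value for value in values if value % 2 == 0]
print(evens.tolist(), evens_list)  # [8, 12] [8, 12]

# `*` differs between the two types:
# arrays scale each element, lists repeat their contents.
print((array * 2).tolist())  # [6, 16, 10, 24, 14]
print(values * 2)            # [3, 8, 5, 12, 7, 3, 8, 5, 12, 7]

# A pure-Python counterpart to the dot product: pair elements with zip,
# multiply each pair, and sum the products.
l1, l2 = [1, 2, 3], [4, 5, 6]
dot_list = sum(a * b for a, b in zip(l1, l2))
dot_array = np.array(l1) @ np.array(l2)
print(dot_list, dot_array)  # 32 32
```

The `*` difference is why the list version of scalar multiplication needs a comprehension rather than `vanilla_list * 2`.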