
Standard Deviation

Standard deviation is a fundamental statistical measure of how much the values in a dataset vary around their mean. In this tutorial, we'll learn the basics of calculating standard deviation with NumPy.


  • While the mean and median tell us about the center of our data, they say nothing about how spread out the data is. That’s where standard deviation comes in.

  • The standard deviation, like the interquartile range, indicates the spread of the data. The greater the standard deviation, the further the values tend to lie from the mean. The lower the standard deviation, the more tightly the values cluster around the mean.

  • We can find the standard deviation of a dataset using the NumPy function np.std().
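To make the idea of "distance from the mean" concrete, the following minimal sketch computes the population standard deviation step by step and compares it with np.std(). The array `data` and the intermediate variable names are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = data.mean()                   # center of the data (5.0)
squared_diffs = (data - mean) ** 2   # squared distance of each value from the mean
variance = squared_diffs.mean()      # average squared distance (population variance, 4.0)
manual_std = np.sqrt(variance)       # standard deviation is the square root of the variance

print(manual_std)    # 2.0
print(np.std(data))  # 2.0, same result from the built-in function
```

Both lines print the same value because np.std() with its default settings performs this same population calculation.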

Syntax

The basic syntax of np.std(), showing its most commonly used parameters, is:

numpy.std(a, axis=None, dtype=None, ddof=0)
  • a: The input array.

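  • axis: (Optional) The axis along which to calculate the standard deviation. With the default of None, the standard deviation is computed over the flattened array.

  • dtype: (Optional) Data type used in computing the standard deviation.

  • ddof: (Optional) Delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of elements. The default of 0 gives the population standard deviation; set it to 1 for the sample standard deviation.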
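The effect of the axis and ddof parameters is easiest to see on a small two-dimensional array. The sketch below uses a hypothetical 2x3 array named `scores`; the variable names are illustrative and not part of NumPy.

```python
import numpy as np

# Hypothetical 2x3 array, used only to illustrate the parameters.
scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

# axis=None (the default): one value computed over all six elements.
overall = np.std(scores)

# axis=0: one value per column. Each column holds two values 3 apart,
# so every entry of the result is 1.5.
per_column = np.std(scores, axis=0)

# axis=1: one value per row (population standard deviation, divisor N).
per_row = np.std(scores, axis=1)

# ddof=1: sample standard deviation per row (divisor N - 1).
# Each row is three consecutive integers, so every entry is 1.0.
sample_per_row = np.std(scores, axis=1, ddof=1)

print(overall)
print(per_column)
print(per_row)
print(sample_per_row)
```

Because ddof=1 shrinks the divisor, the sample standard deviation is always at least as large as the population value for the same data.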

Example:

import numpy as np

# Two datasets with very different scales
arr1 = np.array([68, 1820, 1420, 2062, 704, 1156, 1857, 1755, 2092, 1384])

arr2 = np.array([20, 43, 99, 200, 12, 250, 58, 120, 230, 215])

# Mean of each dataset
arr1_avg = np.mean(arr1)
arr2_avg = np.mean(arr2)
# Population standard deviation of each dataset (ddof defaults to 0)
arr1_std = np.std(arr1)
arr2_std = np.std(arr2)

print(arr1_avg)
print(arr2_avg)
print(arr1_std)
print(arr2_std)

Output:

1431.8
124.7
611.3183785884406
87.22505374031019