Random Sampling
Random sampling is a simple but powerful technique often used in various fields like statistics, data science, and research. In this tutorial, we'll explain what random sampling is and how to use it with Python.
What is Random Sampling?
- Random sampling is like picking a few candies from a big jar without looking: every candy has the same chance of being picked.
- In data, it means choosing a random subset from a larger collection so that we can generalize from the subset to the whole collection.
Why Do We Use Random Sampling?
- Random sampling helps us study a small part of a big group to understand and make predictions about the entire group.
- It's useful in surveys, experiments, and when we have too much data to work with all at once.
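To see why a small sample can stand in for a large group, here is a minimal sketch. The population is made up for illustration (10,000 simulated heights), and the sample size of 200 and the seed are arbitrary choices. It compares the mean of the whole population with the mean of a random sample:

```python
import random
import statistics

random.seed(42)  # arbitrary fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in centimetres
population = [random.gauss(170, 10) for _ in range(10_000)]

# Simple random sample of 200 values, drawn without replacement
sample = random.sample(population, 200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The two means should land close to each other even though the sample holds only 2% of the data.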
Random Sampling in Python:
- Python has tools to make random sampling easy.
- We'll use the `random` module for basic sampling and NumPy for more advanced options.
Basic Random Sampling with Python:
- We can use Python's built-in `random` module to do simple random sampling.
- Here's how to randomly pick an integer between 1 and 10 (`randint()` includes both endpoints):
Code:
import random
random_number = random.randint(1, 10)
print("Random Number:", random_number)
Output (the number will differ from run to run):
Random Number: 9
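The `random` module can also sample directly from a collection. The sketch below uses `random.choice()`, `random.sample()`, and `random.choices()`; the list of names and the seed are made up for illustration:

```python
import random

random.seed(0)  # arbitrary fixed seed so the run is repeatable

# Hypothetical collection to sample from
names = ["Ana", "Ben", "Chloe", "Dev", "Elif", "Femi", "Gus", "Hana"]

# Pick a single element
one_name = random.choice(names)

# Pick 3 different elements (sampling without replacement)
three_names = random.sample(names, 3)

# Pick 3 elements where repeats are allowed (sampling with replacement)
with_repeats = random.choices(names, k=3)

print("One name:", one_name)
print("Three names:", three_names)
print("With repeats:", with_repeats)
```

Use `random.sample()` when each item may appear at most once, and `random.choices()` when the same item may be drawn more than once.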
Random Sampling with NumPy:
- NumPy is handy for more complex sampling tasks.
- Let's say we want to pick 3 random elements from a list:
Code:
import numpy as np
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
sample = np.random.choice(data, 3, replace=False)
print("Random Sample:", sample)
In this code, `replace=False` tells NumPy to sample without replacement, so the three selected values are all different.
Output:
Random Sample: [80 10 90]
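As one of the "more advanced options", NumPy can also sample with replacement or give some items a higher chance than others. The sketch below uses NumPy's `default_rng()` generator, an alternative to the `np.random.choice` call above; the seed, the `colors` list, and its probabilities are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary fixed seed

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Without replacement: every selected value is different
unique_sample = rng.choice(data, size=3, replace=False)

# With replacement: the same value may appear more than once
repeat_sample = rng.choice(data, size=5, replace=True)

# Weighted sampling: p gives each item's probability and must sum to 1
colors = ["red", "green", "blue"]  # hypothetical items
weighted_sample = rng.choice(colors, size=10, p=[0.5, 0.3, 0.2])

print("Without replacement:", unique_sample)
print("With replacement:", repeat_sample)
print("Weighted:", weighted_sample)
```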
Generating Random Arrays
- NumPy works with arrays, so it is worth learning how to generate random arrays with the `numpy.random` module.
- The `randint()` function generates an array of random integers. Its `size` parameter sets the size of the array, and a single argument such as `100` is an exclusive upper bound, so the values range from 0 to 99:
from numpy import random
# 1-D array of 6 random integers from 0 to 99
x = random.randint(100, size=6)
print(x) # [24 22 19 63 0 26]
Similarly, we can generate a 2-D array by passing a tuple to `size` that gives the number of rows and the number of elements in each row:
from numpy import random
x = random.randint(100, size=(4, 6))
print(x)
#[[86 47 69 65 78 14]
# [93 93 92 73 42 87]
# [47 10 3 68 60 57]
# [52 14 47 34 87 95]]
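The bounds of NumPy's `randint()` behave differently from Python's built-in `random.randint()`: the upper bound is excluded. The sketch below shows the one-argument and two-argument forms; the seed and the dice example are arbitrary choices for illustration:

```python
from numpy import random

random.seed(7)  # arbitrary fixed seed so the run is repeatable

# One argument: integers from 0 up to, but not including, 10
digits = random.randint(10, size=8)

# Two arguments: low is included, high is excluded,
# so this simulates dice rolls from 1 to 6
dice = random.randint(1, 7, size=(2, 5))

print(digits)
print(dice)
```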
We can also use the `rand()` function to generate arrays of floats drawn uniformly from the interval [0, 1):
from numpy import random
x = random.rand(5) #1-D array
y = random.rand(3, 5) #2-D array
print(x)
print('--------------------')
print(y)
Output:
[0.28352016 0.07261339 0.08131493 0.49114699 0.3350447 ]
--------------------
[[0.73124232 0.5422545  0.32198949 0.60167792 0.31638704]
 [0.27968805 0.52963824 0.34528043 0.37841453 0.01423927]
 [0.83478357 0.57090813 0.76042533 0.04061672 0.04093552]]
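Two things often come up once you start generating random arrays: getting the same numbers on every run, and getting floats in a range other than 0 to 1. The sketch below shows one way to do both with `random.seed()` and simple arithmetic; the seed value and the range 5 to 10 are arbitrary choices for illustration:

```python
from numpy import random

# Setting the same seed before each call reproduces the same values
random.seed(123)
first = random.rand(3)

random.seed(123)
second = random.rand(3)

print((first == second).all())  # True

# Stretch and shift values from [0, 1) to cover the range low to high
low, high = 5.0, 10.0  # hypothetical range
scaled = low + (high - low) * random.rand(4)
print(scaled)
```

Fixing the seed is handy in tutorials and tests, where you want the printed output to match from one run to the next.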