Groupby
- Pandas groupby splits data into groups, applies a function to each group, and combines the results. This makes aggregating data by category efficient.
- The groupby() function splits the data into groups based on one or more keys, such as a column name, a mapping, or a function.
- Syntax:
DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=NoDefault.no_default, observed=False, dropna=True)
Parameters:
- by: mapping, function, label (str), or list of labels. Determines the groups.
- axis: int, default 0. Split along rows (0) or columns (1).
- level: If the axis is a MultiIndex, group by a particular level or levels.
- as_index: For aggregated output, return an object with group labels as the index. Only relevant for DataFrame input. as_index=False gives “SQL-style” grouped output.
- sort: Sort group keys. Turning this off can improve performance. It does not influence the order of observations within each group; groupby preserves the order of rows within each group.
- group_keys: When calling apply, add group keys to the index to identify pieces.
- squeeze: Reduce the dimensionality of the return type if possible, otherwise return a consistent type. Deprecated since pandas 1.1 and removed in pandas 2.0.
- Returns: GroupBy object.
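To make the as_index and sort parameters concrete, here is a minimal sketch. The column names team and score and the sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the examples in this section.
df = pd.DataFrame({
    "team": ["b", "a", "b", "a"],
    "score": [1, 2, 3, 4],
})

# Default: group labels become the index and the keys are sorted.
default_result = df.groupby("team").sum()

# as_index=False keeps the group labels as a regular column ("SQL-style").
flat_result = df.groupby("team", as_index=False).sum()

# sort=False keeps the groups in order of first appearance.
unsorted_result = df.groupby("team", sort=False).sum()

print(default_result)
print(flat_result)
print(unsorted_result)
```

With as_index=False the result has a plain integer index, which is often convenient when the output is merged or exported afterwards.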
- Example:
import pandas as pd

arr = [[11, 12, 13], [41, 76, 34], [23, None, 37], [91, 12, 20]]
df = pd.DataFrame(arr, columns=['p', 'q', 'r'])
print(df)

sk = df.groupby('q')  # splitting the data based on column 'q'
print(sk.first())
print('----------------')
print(df.groupby('q').sum())  # sum of the other columns for each value of q
print('----------------')
Output:
    p     q   r
0  11  12.0  13
1  41  76.0  34
2  23   NaN  37
3  91  12.0  20
       p   r
q
12.0  11  13
76.0  41  34
----------------
        p   r
q
12.0  102  33
76.0   41  34
----------------
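In the output above, the row whose q value is NaN does not appear in either grouped result, because dropna=True is the default in the signature. The following sketch reuses the same data to show the effect of dropna=False; it assumes pandas 1.1 or later, where this parameter is available.

```python
import pandas as pd

arr = [[11, 12, 13], [41, 76, 34], [23, None, 37], [91, 12, 20]]
df = pd.DataFrame(arr, columns=["p", "q", "r"])

# dropna=False keeps rows whose key is NaN as a group of their own.
with_nan = df.groupby("q", dropna=False).sum()
print(with_nan)
```

The result then has three groups instead of two, with the NaN key listed after the sorted numeric keys.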
- Example:
import pandas as pd

df = pd.DataFrame({'Avengers': ['Falcon', 'Falcon', 'Iron Man', 'Iron Man'],
                   'Max Speed': [380., 370., 424., 226.]})
print(df)
print('---------------------------')
print(df.groupby('Avengers').mean())
Output:
   Avengers  Max Speed
0    Falcon      380.0
1    Falcon      370.0
2  Iron Man      424.0
3  Iron Man      226.0
---------------------------
          Max Speed
Avengers
Falcon        375.0
Iron Man      325.0