Left-Right and Outer Merge
Outer Merge
- Till now, we have been dealing with inner merge which, only includes matching rows. If rows don't match perfectly, we lose the unmatched rows.
- Let's check our dataset first
import pandas as pd store_a = pd.read_csv('store_a.csv') store_b = pd.read_csv('store_b.csv') print(store_a.head()) print(store_b.head())
-
Now, if we want to use an outer join which would include all rows from both the tables, and if they don't match it will be filled with None or nan (not a number). We'll use the parameter 'how' and passed 'outer' as a value.
outer = pd.merge(store_a, store_b, how='outer') print(outer.head())
Left Merge
- Left Merge includes all rows from the first (left) table, but only rows from the second (right) table that match the first table.
- It matters how the arguments are presented. Only rows that occur in Table A will be returned if the initial DataFrame is Table A and a left join is performed.
- By placing Table A first, we receive all of Table A's rows and only those clients of Table B who are also Rows of Table A.
store_a_b_left = pd.merge(store_a, store_b, how = 'left')
store_b_a_left = pd.merge(store_b, store_a, how = 'left')
Here, these two tables will be different as store_a has some rows which is not present in store_b and vice-versa.
Right Merge
- Right merge is the opposite of left merge.
- Here, the merged table will include all rows from the second (right) table, but only rows from the first (left) table that match the second table.
store_a_b_right = pd.merge(store_a, store_b, how = 'right') store_b_a_right = pd.merge(store_b, store_a, how = 'right')