Instagram
youtube
Facebook
Twitter

Merging and Specific Columns

Merge on Specific Columns

  • In the previous section, the merge() function “knew” how to merge tables based on the columns that were the same between two tables. But what If there is no common column and this is not a universal way to merge there are others way too.
  • Let's check out our datasets for this tutorial.
    import pandas as pd
    
    orders = pd.read_csv('orders.csv')
    products = pd.read_csv('products.csv')
    print(orders)
    print(products)

  • Now if we use merge function we'll get                                                                                                                                   

    #id	  description	       price
    # 1	  thing-a-ma-jig    	5
    # 2	  whatcha-ma-call-it	10
    # 3	  doo-hickey	        7
    # 4	  gizmo	                3

    Here, the id column means different in each table so this is not the correct way.

  • We can use rename() method of pandas to rename columns for merges. So, in our orders dataset, we have a customers_id column, however, it is missing in our customer's table. So, here we are going to rename id column in customers to customers_id which will result in the correct merge.

    customers = customers.rename(columns={'id': 'customer_id'})
    pd.merge(orders,customers)