Python Selecting Training Set Programme Codechef Solution

Problem

You are given a dataset consisting of N items. Each item is a pair of a word and a boolean denoting whether the given word is a spam word or not.

We want to use this dataset for training our latest machine learning model, so we want to choose some subset of it as a training dataset. The training set must contain no contradictions, i.e., no word in it may be marked both as spam and as not-spam. For example, the items {"fck", 1} and {"fck", 0} can't both be present in the training set, because the first item says the word "fck" is spam, whereas the second item says it is not, which is a contradiction.

Your task is to select the maximum number of items in the training set.

Note that the same pair {word, bool} can appear multiple times in the input. The training set can also contain the same pair multiple times.

Input

  • The first line will contain T, the number of test cases. Then the test cases follow.

  • The first line of each test case contains a single integer, N.

  • N lines follow. For each valid i, the i-th of these lines contains a string wi, followed by a space, and an integer (boolean) si, denoting the i-th item.

Output

For each test case, output an integer corresponding to the maximum number of items that can be included in the training set in a single line.

Constraints

  • 1<=T<=10

  • 1<=N<=25,000

  • 1<=|wi|<=5 for each valid i

  • 0<=si<=1 for each valid i

  • w1, w2, ..., wN contain only lowercase English letters.

Sample Input:

3
3
abc 0
abc 1
efg 1
7
fck 1
fck 0
fck 1
body 0
body 0
body 0
ram 0
5
vv 1
vv 0
vv 0
vv 1
vv 1

Sample Output:

2
6
3


Explanation

Example case 1: You can include either the first or the second item, but not both. The third item can also be taken. This way, the training set can contain at most 2 items.

Example case 2: You can include all the items except the second item in the training set.
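
The reasoning in these examples generalises: each word can be handled on its own, and for each word you keep whichever label (0 or 1) appears more often. Below is a minimal sketch of that idea, checked against the first and third sample cases. The function name max_training_items and the list-of-pairs input are assumptions made for illustration; they are not part of the problem statement.

```python
from collections import Counter

def max_training_items(items):
    """Return the size of the largest contradiction-free subset.

    `items` is a list of (word, label) pairs, where label is 0 or 1.
    """
    counts = Counter(items)                 # (word, label) -> occurrences
    words = {word for word, _ in items}     # distinct words
    # For each word, keep whichever label occurs more often.
    return sum(max(counts[(w, 0)], counts[(w, 1)]) for w in words)

case1 = [("abc", 0), ("abc", 1), ("efg", 1)]
case3 = [("vv", 1), ("vv", 0), ("vv", 0), ("vv", 1), ("vv", 1)]
print(max_training_items(case1))  # 2
print(max_training_items(case3))  # 3
```

In the third sample case, "vv" is marked 1 three times and 0 twice, so keeping the three items marked 1 gives the answer 3.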

Solution:

try:
    t=int(input())  # number of test cases (no prompt: the judge reads stdout)
    for i in range(t):
        n=int(input())  # number of items in this test case
        d={}  # word -> [count of label 0, count of label 1]
        for j in range(n):
            a,b=map(str,input().split())
            b=int(b)
            if a not in d:
                d[a]=[0,0]
            d[a][b]+=1
        sums=0
        for j in d:
            # keep the more frequent label of each word
            sums=sums+max(d[j])
        print(sums)
except EOFError:
    pass

Steps to solve this problem:

  • In the try block, read the number of test cases and store it in t.

  • In the loop, read the number of items for the current test case and store it in n.

  • Create an empty dictionary.

  • In a nested loop, read each line containing a word and a 0/1 value, split it, and store the two parts in variables a and b (map(str, ...) leaves both as strings).

  • Convert b into an int type using the int() function.

  • If a is not yet a key in the dictionary d, initialise d[a] to [0,0], the counts of 0 and 1 labels seen for that word. Then increment d[a][b].

  • Initialise a variable sums with 0.

  • Loop over the words in d and add max(d[j]), the larger of that word's two counts, to sums.

  • Print the value of sums.

  • The except block is left empty apart from a pass statement, so the program exits quietly if reading the input fails.
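
The steps above can also be packaged as a function that parses the whole input in one go, which avoids calling input() up to 25,000 times per test case. This is only a sketch under stated assumptions: the name solve and the idea of passing the input as a single string are hypothetical choices for illustration, not part of the original solution. On a judge you would pass sys.stdin.read() and print one answer per line.

```python
def solve(text):
    """Parse the full input text and return one answer per test case."""
    tokens = text.split()          # all whitespace-separated tokens
    pos = 0
    t = int(tokens[pos]); pos += 1
    answers = []
    for _ in range(t):
        n = int(tokens[pos]); pos += 1
        d = {}                     # word -> [count of label 0, count of label 1]
        for _ in range(n):
            word, flag = tokens[pos], int(tokens[pos + 1])
            pos += 2
            d.setdefault(word, [0, 0])[flag] += 1
        # Keep the more frequent label of every word.
        answers.append(sum(max(pair) for pair in d.values()))
    return answers

sample = """3
3
abc 0
abc 1
efg 1
7
fck 1
fck 0
fck 1
body 0
body 0
body 0
ram 0
5
vv 1
vv 0
vv 0
vv 1
vv 1
"""
print(solve(sample))  # [2, 6, 3]
```

Returning a list instead of printing inside the loop keeps the function easy to check against the sample input.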