| | Input: Feature set |
| | Output: Optimal set of features |
| | (1) For feature xi in x1, x2, …, xn |
| | a. Read feature xi into the array named X |
| | X = {x1, x2, x3,…xi} |
| | b. Read the target variable into array named Y |
| | c. Set the train–test split ratio |
| | Train_r = 0.8 |
| | Test_r = 0.2 |
| | d. Fix the initial seed of the random generator for the train–test split |
| | random_state = n |
| | e. Split the data set into x_train, x_test, y_train and y_test using the train–test split ratio and random_state |
| | f. Train and test the classifier for feature xi and the target |
| | g. Compute accuracy using |
| | Accuracy = (TP + TN)/(TP + FP + TN + FN) |
| | h. Compute Precision using |
| | Precision = TP/(TP + FP) |
| | i. Calculate Recall rate using |
| | Recall = TP/(TP + FN) |
| | j. Find F-score using |
| | F-score = 2 ∗ (Recall ∗ Precision)/(Recall + Precision) |
| | k. Repeat steps c through j with different values of the train–test split ratio |
| | (2) Combine features into categories C1 (Linguistic features (L)), C2 (Sentiment features (S)) and C3 (Contradictory features (C)) |
| | a. For feature category Ci in C1, C2 and C3 |
| | b. Repeat steps a through k |
| | (3) Combine categories of features (L + S), (S + C), (L + C) and (L + S + C) |
| | (4) For each category combination Ci in (L + S), (S + C), (L + C) and (L + S + C) |
| | a. Repeat steps a through k |
| | (5) Repeat steps 1 through 4 for different types of classifier |
| | (6) Select the feature, category, or category combination that yields the highest accuracy |
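The per-split evaluation in steps c–j can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: `split_data` stands in for a library routine such as scikit-learn's `train_test_split`, and the metric formulas follow steps g–j, with a zero fallback when a denominator is empty (an assumed convention, since the algorithm does not specify one).

```python
import random

def split_data(X, Y, test_r=0.2, random_state=0):
    """Shuffle with a fixed seed, then split into train/test sets (steps c-e).

    Stand-in for a library splitter such as sklearn's train_test_split.
    """
    idx = list(range(len(X)))
    random.Random(random_state).shuffle(idx)
    n_test = int(len(X) * test_r)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    x_train = [X[i] for i in train_idx]
    x_test = [X[i] for i in test_idx]
    y_train = [Y[i] for i in train_idx]
    y_test = [Y[i] for i in test_idx]
    return x_train, x_test, y_train, y_test

def evaluate(y_true, y_pred):
    """Compute the metrics of steps g-j from confusion-matrix counts
    (positive class assumed to be labelled 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Guard against empty denominators (assumed convention: report 0.0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * recall * precision / (recall + precision)
               if (recall + precision) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}
```

Running the outer loops of the algorithm then amounts to calling `split_data` and `evaluate` once per feature, per category, per category combination, and per classifier, and keeping the configuration with the best accuracy.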