| | Input: Feature set |
| | Output: Optimal set of features |
| | (1) For feature xi in x1, x2, …, xn |
| | a. Read feature xi into the array named X |
| | X = {x1, x2, x3,…xi} |
| | b. Read the target variable into array named Y |
| | c. Set the train–test split ratio |
| | Train_r = 0.8 |
| | Test_r = 0.2 |
| | d. Fix the initial seed of the random generator for the train–test split |
| | random_state = n |
| | e. Split the data set into x_train, x_test, y_train and y_test using the train–test split ratio and random_state |
| | f. Train and test the classifier for feature xi and the target |
| | g. Compute accuracy using |
| | Accuracy = (TP + TN)/(TP + FP + TN + FN) |
| | h. Compute Precision using |
| | Precision = TP/(TP + FP) |
| | i. Calculate Recall rate using |
| | Recall = TP/(TP + FN) |
| | j. Find F-score using |
| | F-score = 2 ∗ (Recall ∗ Precision)/(Recall + Precision) |
| | k. Repeat steps c through j with different values of the train–test split ratio |
| | (2) Combine features into categories C1 (Linguistic features (L)), C2 (Sentiment features (S)) and C3 (Contradictory features (C)) |
| | a. For feature category Ci in C1, C2 and C3 |
| | b. Repeat steps a through k |
| | (3) Combine categories of features (L + S), (S + C), (L + C) and (L + S + C) |
| | (4) For each category combination Ci in (L + S), (S + C), (L + C) and (L + S + C) |
| | a. Repeat steps a through k |
| | (5) Repeat steps 1 through 4 for different types of classifier |
| | (6) Select the feature, category, or category combination that yields the highest accuracy |
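The per-split evaluation in steps c–j can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: `split_data` stands in for a library routine such as scikit-learn's `train_test_split`, and the metric formulas follow steps g–j, with a zero fallback when a denominator is empty (an assumed convention, since the algorithm does not specify one).

```python
import random

def split_data(X, Y, test_r=0.2, random_state=0):
    """Shuffle with a fixed seed, then split into train/test sets (steps c-e).

    Stand-in for a library splitter such as sklearn's train_test_split.
    """
    idx = list(range(len(X)))
    random.Random(random_state).shuffle(idx)
    n_test = int(len(X) * test_r)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    x_train = [X[i] for i in train_idx]
    x_test = [X[i] for i in test_idx]
    y_train = [Y[i] for i in train_idx]
    y_test = [Y[i] for i in test_idx]
    return x_train, x_test, y_train, y_test

def evaluate(y_true, y_pred):
    """Compute the metrics of steps g-j from confusion-matrix counts
    (positive class assumed to be labelled 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Guard against empty denominators (assumed convention: report 0.0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * recall * precision / (recall + precision)
               if (recall + precision) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}
```

Running the outer loops of the algorithm then amounts to calling `split_data` and `evaluate` once per feature, per category, per category combination, and per classifier, and keeping the configuration with the best accuracy.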