(1) Vectors are built by counting how often each word occurs in a document and across the corpus (2) A word's weight is proportional to its count in the document and inversely proportional to its count in other documents (3) The importance of frequently occurring common words, e.g., “while,” “but,” “the,” and “is,” is reduced (4) Computing similarity is easy (see the TF-IDF sketch below)
(1) Similarity is based merely on word frequency, neglecting semantic similarity (2) The vectors are large (3) Co-occurrence of words within a document is not recorded (4) Vectors are sparse (5) Synonyms are not considered (6) Polysemous words have a single vector; for example, apple is a fruit and Apple is a company, yet both receive the same vector representation
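To make the count-based weighting concrete, the following minimal sketch (scikit-learn assumed available; the corpus is illustrative) builds TF-IDF vectors and compares two documents by cosine similarity; note that the vectors are sparse and the similarity reflects word frequency only.

```python
# Minimal TF-IDF sketch on a toy corpus (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Term frequency weighted by inverse document frequency:
# common words such as "the" receive low weights.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)      # sparse document-term matrix

print(X.shape)                            # (3, vocabulary size)
print(cosine_similarity(X[0], X[1]))      # frequency-based similarity only
```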
2. Count based: Global Vectors (GloVe) [30], co-occurrence matrix [29]
(1) It is a hybrid method combining a statistical co-occurrence matrix with machine learning (2) It records how sets of words co-occur in a corpus (3) It captures semantic similarity, e.g., between King and Queen (4) Dimensionality reduction lowers the number of dimensions while producing more accurate vectors (see the co-occurrence sketch below)
(1) Costly in terms of memory for recording word co-occurrences
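The sketch below (illustrative corpus and window size) builds a window-based word co-occurrence matrix and reduces it with SVD; GloVe itself fits a weighted least-squares model on the co-occurrence counts rather than a plain SVD, so this only illustrates the count matrix and the dimensionality-reduction step. The full vocabulary-by-vocabulary count matrix also shows why memory cost grows quickly.

```python
# Sketch: window-based co-occurrence counts plus SVD dimensionality reduction.
from collections import defaultdict
import numpy as np

corpus = ["the king rules the realm", "the queen rules the realm"]
window = 2

tokens = [sent.split() for sent in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Count how often word pairs appear within the context window.
# The matrix is vocabulary x vocabulary, hence the memory cost.
counts = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[index[w], index[sent[j]]] += 1

# Truncated SVD gives dense, lower-dimensional word vectors.
U, S, _ = np.linalg.svd(counts)
k = 2
embeddings = U[:, :k] * S[:k]
print(embeddings.shape)   # (vocabulary size, k)
```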
(1) Word analogies and word similarities are captured (2) Measures the likelihood of words (3) “King - man + woman = Queen,” a notable feature of word embeddings (4) Vectors can infer the analogy “king : man as queen : woman” (5) Input words are mapped to target words (6) Probabilistic methods generally perform better than deterministic methods [32] (7) Comparatively little memory is consumed
(1) Training becomes difficult as the vocabulary grows large (2) Polysemous words receive an aggregated vector representation in CBOW, whereas in Skip-gram they keep separate vectors (see the Word2Vec sketch below)
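The following sketch (gensim assumed installed; the corpus is a toy example) trains CBOW and Skip-gram variants of Word2Vec and queries the king - man + woman analogy; a corpus this small will not yield meaningful analogies, so in practice a large corpus or pretrained vectors is required.

```python
# Sketch: CBOW vs. Skip-gram training and an analogy query with gensim.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "realm"],
    ["the", "queen", "rules", "the", "realm"],
    ["a", "man", "and", "a", "woman"],
]

cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)      # CBOW
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # Skip-gram

# With well-trained vectors, "king - man + woman" is expected to be close to "queen".
print(skipgram.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```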