Research Article

Exploiting Contextual Word Embedding of Authorship and Title of Articles for Discovering Citation Intent Classification

Table 4

Overview of word embedding techniques with their strengths and weaknesses.

No. | Algorithm | Strengths | Weaknesses | Type of word embedding

1 | TF-IDF [29]
Strengths:
(1) Vectors are built by counting how often a word occurs within a document and across the corpus
(2) A word's weight is proportional to its count in a document and inversely proportional to its count in the other documents
(3) Reduces the importance of frequently occurring common words, e.g., "while," "but," "the," and "is"
(4) Computing similarity is easy
Weaknesses:
(1) Similarity is based purely on word frequency and neglects semantic similarity
(2) The vectors are large
(3) Co-occurrence of words in a document is not recorded
(4) Vectors are sparse
(5) Synonyms are not considered
(6) Polysemous words have a single vector; for example, "apple" the fruit and "Apple" the company share the same vector representation
Type: Count based (see the TF-IDF sketch after the table)

2 | Global Vectors (GloVe) [30], co-occurrence matrix [29]
Strengths:
(1) A hybrid method that combines a statistical co-occurrence matrix with machine learning
(2) Records the co-occurrence of word pairs across the corpus
(3) Captures semantic similarity, e.g., between "king" and "queen"
(4) Dimensionality reduction shrinks the vectors while producing more accurate representations
Weaknesses:
(1) Costly in terms of memory for recording word co-occurrences
Type: Count based (see the co-occurrence sketch after the table)

3 | Word2Vec [31]
Strengths:
(1) Word analogies and word similarities are captured
(2) Measures the likelihoods of words
(3) "king - man + woman = queen," which is a great feature of word embedding
(4) Vectors can infer "king : man as queen : woman"
(5) Input words are mapped to target words
(6) Probabilistic methods generally perform better than deterministic methods [32]
(7) Comparatively little memory is consumed
Weaknesses:
(1) Training becomes difficult with a large vocabulary
(2) Polysemous words receive an aggregated vector representation in CBOW, whereas Skip-gram keeps separate vectors for them
Type: Prediction based (see the analogy sketch after the table)

4 | ELMo [33], InferSent [34], BERT [33]
Strengths:
(1) Positional embeddings are incorporated, creating different vectors for the same word depending on its position and context in a sentence/paragraph
Weaknesses:
(1) Contextualized embeddings require a lot of computation
Type: Prediction based (see the contextual-embedding sketch after the table)
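
To make the count-based weighting in row 1 concrete, the following is a minimal TF-IDF sketch. The toy corpus and the use of scikit-learn's TfidfVectorizer are illustrative assumptions and not tooling prescribed by the study; the sketch only demonstrates the sparsity and the single shared "apple" feature noted as a weakness.

```python
# Minimal TF-IDF sketch (illustrative toy corpus, scikit-learn assumed available).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the apple is a sweet fruit",
    "Apple released a new phone",      # different sense, same token after lowercasing
    "the citation intent of the article is discussed",
]

vectorizer = TfidfVectorizer()          # counts terms, then reweights by inverse document frequency
X = vectorizer.fit_transform(corpus)    # sparse document-term matrix, one row per document

print(vectorizer.get_feature_names_out())  # vocabulary: a single "apple" feature covers both senses
print(X.shape, "non-zeros:", X.nnz)        # vectors are high-dimensional and sparse
print(X.toarray().round(2))                # idf down-weights words appearing in many documents, e.g., "the"
```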
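
Row 2 builds on word-word co-occurrence statistics. The sketch below, a pure NumPy illustration with an assumed window size and toy corpus, shows the kind of co-occurrence matrix GloVe-style methods start from and why its memory cost grows with vocabulary size.

```python
# Sketch of the word-word co-occurrence counts underlying GloVe-style methods.
# The corpus and the symmetric window size of 2 are illustrative assumptions.
import numpy as np

corpus = ["the king spoke to the queen", "the queen answered the king"]
window = 2

tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Dense |V| x |V| matrix: memory grows quadratically with the vocabulary,
# which is the weakness listed in the table.
cooc = np.zeros((len(vocab), len(vocab)))

for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                cooc[index[w], index[sent[j]]] += 1

print(vocab)
print(cooc)  # GloVe fits vectors whose dot products approximate log co-occurrence counts
```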
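
The analogy property in row 3 can be reproduced with pretrained vectors. The gensim library and the "word2vec-google-news-300" model named below are illustrative assumptions, not resources used in the paper.

```python
# Sketch of the Word2Vec analogy "king - man + woman ≈ queen" with gensim.
import gensim.downloader as api

# Downloads the pretrained Google News vectors on first use (roughly 1.6 GB).
wv = api.load("word2vec-google-news-300")

# Vector arithmetic: closest words to (king - man + woman).
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Plain similarity queries are cheap once the vectors are in memory.
print(wv.similarity("king", "queen"))
```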
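
Finally, row 4 claims that contextual models assign different vectors to the same word in different contexts, which resolves the "apple" polysemy weakness of row 1. A minimal sketch is given below; Hugging Face transformers, the "bert-base-uncased" checkpoint, and the helper function word_vector are assumptions made for illustration only.

```python
# Sketch: a contextual model gives the same surface word different vectors per sentence.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of the first occurrence of `word` (hypothetical helper)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]           # (num_tokens, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

v_fruit = word_vector("she ate an apple with her lunch", "apple")
v_company = word_vector("apple announced a new phone today", "apple")

# Cosine similarity below 1.0: the two "apple" occurrences receive distinct vectors.
print(torch.cosine_similarity(v_fruit, v_company, dim=0).item())
```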