Research Article

[Retracted] Relieving the Incompatibility of Network Representation and Classification for Long-Tailed Data Distribution

Figure 2

Framework overview of the proposed method. The training dataset is split into three subsets, and three experts serve as teachers. Each expert transfers the knowledge learned from its corresponding subset to a student model. Knowledge is transferred between feature maps, and only channels with high activation intensity, which we consider to contain more knowledge, are used for distillation. Channel filtering is detailed in Section 4.3.
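
The channel-filtered distillation described in this caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `select_channels` and `channel_distill_loss`, the `keep_ratio` parameter, the use of mean absolute activation as the "intensity" score, and the mean-squared-error objective are all assumptions made for illustration. The actual filtering rule is the one given in Section 4.3.

```python
import numpy as np


def select_channels(teacher_feat, keep_ratio=0.5):
    """Return sorted indices of the most strongly activated teacher channels.

    teacher_feat has shape (N, C, H, W). Intensity is taken to be the mean
    absolute activation per channel, which is an assumption of this sketch.
    """
    intensity = np.abs(teacher_feat).mean(axis=(0, 2, 3))  # shape (C,)
    k = max(1, int(round(keep_ratio * intensity.size)))    # keep at least one
    top = np.argsort(intensity)[::-1][:k]                  # k highest scores
    return np.sort(top)


def channel_distill_loss(teacher_feat, student_feat, keep_ratio=0.5):
    """Mean squared error between teacher and student feature maps,
    restricted to the channels chosen by select_channels (hypothetical loss)."""
    idx = select_channels(teacher_feat, keep_ratio)
    diff = teacher_feat[:, idx] - student_feat[:, idx]
    return float(np.mean(diff ** 2))


# Toy feature maps: 2 samples, 4 channels, 3x3 spatial size.
teacher = np.zeros((2, 4, 3, 3))
teacher[:, 1] = 1.0
teacher[:, 2] = 3.0
teacher[:, 3] = 2.0
student = np.zeros_like(teacher)

kept = select_channels(teacher, keep_ratio=0.5)
loss = channel_distill_loss(teacher, student, keep_ratio=0.5)
```

With three experts, one plausible arrangement is to compute this loss once per expert on batches drawn from that expert's subset and to sum the three terms into the student's training objective. How the terms are weighted is not stated in the caption.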