Semantic Extraction of Basketball Game Video Combining Domain Knowledge and In-Depth Features

<div>Our network architecture. We use a standard CNN architecture (VGG-16) to extract features from sampled appearance and motion frames of the video. The aggregation layer proposed in this paper then pools these features across space and time; the layer can be trained end-to-end with a classification loss.</div>
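
The pipeline in the caption can be illustrated with a minimal sketch. The caption does not specify how the aggregation layer pools features, so plain average pooling over all frames and spatial positions is used here as a stand-in, and the appearance and motion streams are fused by simple concatenation. Both choices, along with all function names, toy feature values, and classifier weights, are assumptions for illustration and are not taken from the paper; the VGG-16 feature extraction step is replaced by hand-written toy descriptors.

```python
import math

def aggregate(features):
    """Average-pool descriptors over every frame and spatial position.

    features: list of frames; each frame is a list of spatial positions;
    each position is a list of D floats (a stand-in for CNN features).
    NOTE: average pooling is an assumption, not the paper's layer.
    """
    descriptors = [d for frame in features for d in frame]
    dim = len(descriptors[0])
    return [sum(d[k] for d in descriptors) / len(descriptors) for k in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(video_vec, weights, bias):
    """Linear classifier followed by softmax (hypothetical weights)."""
    scores = [sum(w * x for w, x in zip(row, video_vec)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

def cross_entropy(probs, label):
    """Classification loss for a single video with the given true label."""
    return -math.log(probs[label])

# Toy features: 2 frames x 2 positions x D=2 for appearance,
# 2 frames x 1 position x D=2 for motion (values are made up).
appearance = [[[1.0, 0.0], [3.0, 2.0]], [[2.0, 2.0], [2.0, 0.0]]]
motion = [[[0.0, 4.0]], [[2.0, 0.0]]]

# One fixed-length vector per stream, then concatenation (assumed fusion).
video_vec = aggregate(appearance) + aggregate(motion)

# Hypothetical 2-class classifier over the 4-dimensional video vector.
weights = [[0.5, -0.5, 0.0, 0.1], [-0.5, 0.5, 0.1, 0.0]]
bias = [0.0, 0.0]

probs = classify(video_vec, weights, bias)
loss = cross_entropy(probs, label=0)
```

In an end-to-end setting, the gradient of `loss` would flow back through the classifier and the pooling step into the feature extractor; this sketch only computes the forward pass.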

Scientific Programming

Figure 2: Semantic Extraction of Basketball Game Video Combining Domain Knowledge and In-Depth Features