Research Article
Channel-Wise Spatiotemporal Aggregation Technology for Face Video Forensics
Require: Training face video clips; corresponding labels.
1: for each training video clip do
2:     Decompose the clip into its sequence of frames;
3:     Detect and crop the faces from the frames, and denote the results as the face frames;
4: end for
5: Feed the face frames into the backbone, producing a set of feature maps;
6: Decompose the feature maps into individual channels;
7: Combine the channels by going through the frames and stacking those that have an equal channel index, producing the aggregated features;
8: Feed the aggregated features into the weight-sharing classifier, producing the prediction;
9: Calculate the binary classification error between the prediction and the label;
10: Update the parameters of the model by backpropagation;
Ensure: Optimal model for fake face detection.
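The channel-wise stacking, shared classifier, and binary error of steps 6 to 9 can be illustrated with a small pure-Python sketch. It is not the authors' implementation. The function names, the toy feature maps, the mean-pool-plus-logistic classifier, and the averaging of per-channel scores into one video-level score are all assumptions made for illustration. The backbone and the backpropagation step are omitted.

```python
import math


def aggregate_channelwise(frame_features):
    """Regroup per-frame features into per-channel temporal stacks.

    frame_features[t][c] is the 2-D map of channel c for frame t.
    The result stacks[c][t] holds, for each channel index c, the maps of
    all frames in temporal order (hypothetical layout, assumed here).
    """
    num_channels = len(frame_features[0])
    if any(len(frame) != num_channels for frame in frame_features):
        raise ValueError("all frames must have the same number of channels")
    # zip(*...) swaps the frame axis and the channel axis.
    return [list(maps) for maps in zip(*frame_features)]


def shared_classifier(stack, weight, bias):
    """Stand-in classifier (assumption): mean-pool a stack, then a logistic unit.

    The same weight and bias are reused for every channel stack, which is
    what "weight sharing" means in this sketch.
    """
    values = [v for fmap in stack for row in fmap for v in row]
    pooled = sum(values) / len(values)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


def binary_cross_entropy(prob, label, eps=1e-12):
    """Binary classification error between a probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Toy input: 3 frames, 2 channels, 2x2 maps filled with the value 10*c + t.
frame_features = [
    [[[10 * c + t] * 2 for _ in range(2)] for c in range(2)]
    for t in range(3)
]

stacks = aggregate_channelwise(frame_features)

# Score every channel stack with the same parameters, then average the
# scores into one video-level probability (averaging is an assumption).
channel_probs = [shared_classifier(s, weight=0.1, bias=0.0) for s in stacks]
video_prob = sum(channel_probs) / len(channel_probs)
loss = binary_cross_entropy(video_prob, label=1)
```

Here `stacks` has one entry per channel, each containing that channel's map from every frame, so temporal information is gathered without any recurrent unit. In a real training loop the loss would be minimized with an automatic-differentiation framework rather than computed once.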