1. Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, and Juan Carlos Niebles. 2015. ActivityNet: A large-scale video benchmark for human activity understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 961–970.
2. Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey E. Hinton. 2020. A Simple Framework for Contrastive Learning of Visual Representations. In Proceedings of the 37th International Conference on Machine Learning. 1597–1607.
3. Arridhana Ciptadi, Matthew S. Goodwin, and James M. Rehg. 2014. Movement pattern histogram for action recognition and retrieval. In Proceedings of the European Conference on Computer Vision (ECCV’14). Springer, 695–710.
5. Junyu Gao, Mengyuan Chen, and Changsheng Xu. 2022. Fine-grained temporal contrastive learning for weakly-supervised temporal action localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19999–20009.