1. An Empirical Study of Training Self-Supervised Vision Transformers
2. Tokenlearner: Adaptive space-time tokenization for videos; Michael; NeurIPS
3. A simple framework for contrastive learning of visual representations; Chen; ICML
4. Exploring the limits of transfer learning with a unified text-to-text transformer; Raffel; J Mach Learn Research, 2020
5. ImageNet: A large-scale hierarchical image database