Bypass network for semantics driven image paragraph captioning
Published:2024-12
Issue:
Volume:249
Page:104154
ISSN:1077-3142
Container-title:Computer Vision and Image Understanding
language:en
Short-container-title:Computer Vision and Image Understanding
Author:
Zheng Qi, Wang Chaoyue, Wang Dadong
Funder:
Shenzhen University
References: 48 articles.
1. Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L., 2018. Bottom-up and top-down attention for image captioning and visual question answering. In: CVPR. pp. 6077–6086.
2. Banerjee, S., Lavie, A., 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In: ACL Workshop. pp. 65–72.
3. Chatterjee, M., Schwing, A.G., 2018. Diverse and coherent paragraph generation from images. In: ECCV. pp. 729–744.
4. Chen, J., Guo, H., Yi, K., Li, B., Elhoseiny, M., 2022. Visualgpt: Data-efficient adaptation of pretrained language models for image captioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 18030–18040.
5. Cornia, M., Stefanini, M., Baraldi, L., Cucchiara, R., 2020. Meshed-memory transformer for image captioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 10578–10587.