Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input

Published:2018 Page:659-677
ISSN:0302-9743
Container-title:Computer Vision – ECCV 2018

Author:

Harwath David, Recasens Adrià, Surís Dídac, Chuang Galen, Torralba Antonio, Glass James

Publisher

Springer International Publishing

Link

http://link.springer.com/content/pdf/10.1007/978-3-030-01231-1_40

References: 59 articles.

1. Alishahi, A., Barking, M., Chrupala, G.: Encoding of phonology in a recurrent neural model of grounded speech. In: CoNLL (2017)

2. Antol, S., et al.: VQA: visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)

3. Arandjelovic, R., Zisserman, A.: Look, listen, and learn. In: ICCV (2017)

4. Aytar, Y., Vondrick, C., Torralba, A.: SoundNet: learning sound representations from unlabeled video. In: Advances in Neural Information Processing Systems, vol. 29, pp. 892–900 (2016)

5. Bergamo, A., Bazzani, L., Anguelov, D., Torresani, L.: Self-taught object localization with deep networks. CoRR abs/1409.3964 (2014). http://arxiv.org/abs/1409.3964

Cited by 88 articles.

1. Multi-modal data clustering using deep learning: A systematic review;Neurocomputing;2024-11

2. Occupation Prediction with Multimodal Learning from Tweet Messages and Google Street View Images;AGILE: GIScience Series;2024-05-30

3. Recent Advances in Synthesis and Interaction of Speech, Text, and Vision;Electronics;2024-04-30

4. Speech Guided Masked Image Modeling for Visually Grounded Speech;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

5. Zero-Shot Intent Classification Using a Semantic Similarity Aware Contrastive Loss and Large Language Model;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14