Authors:
Harwath David, Recasens Adrià, Surís Dídac, Chuang Galen, Torralba Antonio, Glass James
Publisher
Springer International Publishing
References
59 articles.
1. Alishahi, A., Barking, M., Chrupala, G.: Encoding of phonology in a recurrent neural model of grounded speech. In: CoNLL (2017)
2. Antol, S., et al.: VQA: visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
3. Arandjelovic, R., Zisserman, A.: Look, listen, and learn. In: ICCV (2017)
4. Aytar, Y., Vondrick, C., Torralba, A.: SoundNet: learning sound representations from unlabeled video. In: Advances in Neural Information Processing Systems, vol. 29, pp. 892–900 (2016)
5. Bergamo, A., Bazzani, L., Anguelov, D., Torresani, L.: Self-taught object localization with deep networks. CoRR abs/1409.3964 (2014). http://arxiv.org/abs/1409.3964
Cited by
88 articles.