Author of the publication

BabyTalk: Understanding and Generating Simple Image Descriptions.

IEEE Trans. Pattern Anal. Mach. Intell., 35 (12): 2891-2903 (2013)

Other publications of authors with the same name

Combining Multiple Cues for Visual Madlibs Question Answering. Int. J. Comput. Vis., 127 (1): 38-60 (2019)

iWalk: a tool for interacting with geo-located data through movement and gesture. ACM Multimedia, pages 1059-1062. ACM (2010)

TREETALK: Composition and Compression of Trees for Image Descriptions. Trans. Assoc. Comput. Linguistics (2014)

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval. CoRR (2022)

BabyTalk: Understanding and Generating Simple Image Descriptions. IEEE Trans. Pattern Anal. Mach. Intell., 35 (12): 2891-2903 (2013)

End-to-End Visual Editing with a Generatively Pre-trained Artist. ECCV (15), volume 13675 of Lecture Notes in Computer Science, pages 18-35. Springer (2022)

Hipster Wars: Discovering Elements of Fashion Styles. ECCV (1), volume 8689 of Lecture Notes in Computer Science, pages 472-488. Springer (2014)

Iconizer: A Framework to Identify and Create Effective Representations for Visual Information Encoding. Smart Graphics, volume 6815 of Lecture Notes in Computer Science, pages 78-90. Springer (2011)

Multi-Target Embodied Question Answering. CVPR, pages 6309-6318. Computer Vision Foundation / IEEE (2019)

TVQA+: Spatio-Temporal Grounding for Video Question Answering. ACL, pages 8211-8225. Association for Computational Linguistics (2020)