Multimodal Lexical Translation Dataset:
A dataset of ambiguous words and their lexical translations, together with visual and textual contexts (i.e. an image and a sentence, respectively).
If you use this dataset, please cite: http://www.lrec-conf.org/proceedings/lrec2018/summaries/629.html
Each entry has the following fields:
English_Word | Lexical_Translation | Textual_Context (A sentence) | Visual_Context (Image id)
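As a rough illustration of the format above, one entry could be read into a small record as sketched below. The names Entry and parse_entry are hypothetical, and the "|" separator is an assumption taken from how the fields are written in this README; if the data files use a different delimiter (for example tabs), pass it as sep. The example line is made up and is not taken from the dataset.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One dataset entry; field names mirror the format line in this README."""
    english_word: str
    lexical_translation: str
    textual_context: str
    image_id: str


def parse_entry(line: str, sep: str = "|") -> Entry:
    # Assumption: fields are separated by `sep` ("|" as written in this README).
    fields = [field.strip() for field in line.rstrip("\n").split(sep)]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return Entry(*fields)


# Made-up example line, not taken from the dataset.
example = parse_entry("bank | Ufer | a man sits on the bank of a river | 12345")
print(example.english_word, example.lexical_translation, example.image_id)
```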
Please download the images from http://www.quest.dcs.shef.ac.uk/wmt17_files_mmt/images_flickr.task1.tar.gz
The human annotations of the 2018 test set are saved in the files human.de and human.fr.
These files use the same format as above, with an extra column in which human annotators indicated whether the image was used.
English_Word | Lexical_Translations | Textual_Context (A sentence) | Visual_Context (Image id) | Was Image used? (yes/no)
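The extra column makes it possible to count how often annotators reported using the image. The sketch below is only an illustration: the function names are hypothetical, the "|" separator is assumed from the format line above (pass a different sep if the files are delimited otherwise), and the sample lines are made up rather than taken from human.de or human.fr.

```python
from typing import Iterable, Tuple


def image_was_used(line: str, sep: str = "|") -> bool:
    # Assumption: five fields separated by `sep`, the last one being yes/no.
    fields = [field.strip() for field in line.rstrip("\n").split(sep)]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    answer = fields[4].lower()
    if answer not in ("yes", "no"):
        raise ValueError(f"expected yes/no in the last field, got {fields[4]!r}")
    return answer == "yes"


def count_image_usage(lines: Iterable[str], sep: str = "|") -> Tuple[int, int]:
    """Return (entries where the image was used, total entries), skipping blank lines."""
    used = 0
    total = 0
    for line in lines:
        if not line.strip():
            continue
        total += 1
        if image_was_used(line, sep):
            used += 1
    return used, total


# Made-up sample lines, not taken from the annotation files.
sample = [
    "bank | Ufer | a man sits on the bank of a river | 12345 | yes",
    "hat | Hut | a woman wearing a red hat | 67890 | no",
]
used, total = count_image_usage(sample)
print(f"image used in {used} of {total} entries")
```

To apply this to an annotation file, the open file object can be passed directly, e.g. count_image_usage(open("human.de", encoding="utf-8")), assuming the file is UTF-8 encoded.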