May 23 (Tue) 13:50–15:30, Room A (Winc Aichi, 2F Main Hall)
Presentation ID | 1A1-OS-05a-3 |
---|---|
Title | A Neural Machine Translation Model Using Latent Semantic Information from Images and Text |
Authors | Joji Toyama (Dept. of Technology Management for Innovation, Graduate School of Engineering, The University of Tokyo); Masanori Misono (Dept. of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo); Masahiro Suzuki (Dept. of Technology Management for Innovation, Graduate School of Engineering, The University of Tokyo); Kotaro Nakayama (Dept. of Technology Management for Innovation, Graduate School of Engineering, The University of Tokyo); Yutaka Matsuo (Dept. of Technology Management for Innovation, Graduate School of Engineering, The University of Tokyo) |
Time | May 23 (Tue) 14:30–14:50 |
Abstract | Although attention-based Neural Machine Translation has achieved great success, the attention mechanism cannot capture the entire meaning of the source sentence because it generates each target word depending heavily on the relevant parts of the source sentence. Earlier studies introduced a latent variable to capture the entire meaning of the sentence and achieved improvements over attention-based Neural Machine Translation. We follow this approach, and we believe that capturing the meaning of a sentence benefits from image information, because humans understand language not only through textual information but also through perceptual information such as that gained from vision. We propose a neural machine translation model that introduces a continuous latent variable containing underlying semantics extracted from texts and images. Experiments on an English–German translation task show that our model outperforms the baseline in METEOR score. |
Paper | PDF file |
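The abstract describes conditioning the decoder on a continuous latent variable that fuses semantics from both text and image features. A minimal NumPy sketch of that idea, using the standard Gaussian reparameterization trick; all dimensions, weight matrices, and function names below are hypothetical illustrations, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def project(features, W):
    """Project one modality's features into a shared semantic space (sketch)."""
    return np.tanh(features @ W)

# Hypothetical feature sizes: text encoder output, image encoder output, latent dim.
d_text, d_img, d_latent = 8, 6, 4
text_feat = rng.normal(size=d_text)   # e.g., final RNN encoder state
img_feat = rng.normal(size=d_img)     # e.g., CNN image features

W_text = rng.normal(size=(d_text, d_latent))
W_img = rng.normal(size=(d_img, d_latent))

# Fuse both modalities into one semantic summary vector.
h = project(text_feat, W_text) + project(img_feat, W_img)

# Parameterize a diagonal Gaussian over the latent variable z,
# then sample via the reparameterization trick so gradients can flow.
W_mu = rng.normal(size=(d_latent, d_latent))
W_logvar = rng.normal(size=(d_latent, d_latent))
mu = h @ W_mu
logvar = h @ W_logvar
eps = rng.normal(size=d_latent)
z = mu + np.exp(0.5 * logvar) * eps  # latent code fed to the decoder
```

At training time such a model is typically optimized as a variational autoencoder, maximizing the evidence lower bound (reconstruction likelihood minus a KL term between the Gaussian above and a standard-normal prior), while the decoder attends over source states and additionally conditions on `z`.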