June 14 (Thu) 09:00-12:20, Venue B (Yamaguchi Prefectural Education Hall / Training Room 1 (141))
Presentation No. | 3B1-R-2-1 |
---|---|
Title | A Distance Between Text Documents Based on Topic Models and Ground Metric Learning |
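Authors | 金 涛 (Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University); Cuturi Marco (Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University); 山本 章博 (Graduate School of Informatics, Kyoto University) |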
Time | June 14 (Thu) 09:00-09:20 |
Abstract | We propose a new distance between text documents that builds upon two techniques. We first represent each document in a database as a histogram of topics using the Latent Dirichlet Allocation (LDA) topic model. We then compare two documents by computing the Earth Mover's Distance (EMD) between their respective topic histograms. The ground metric that parameterizes the EMD, in this case a metric matrix between topics, is estimated using Ground Metric Learning. Experiments on several text databases illustrate the usefulness of our approach. |
Paper | PDF file |
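
The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the topic histograms are already available and solves the Earth Mover's Distance as a small transportation linear program with `scipy.optimize.linprog`. The function name `earth_movers_distance`, the three-topic histograms, and the hand-written ground metric are hypothetical placeholders. In the approach described above the histograms would come from LDA and the ground metric from Ground Metric Learning, neither of which is reproduced here.

```python
import numpy as np
from scipy.optimize import linprog


def earth_movers_distance(p, q, ground_metric):
    """EMD between two histograms of equal total mass.

    p, q: topic histograms (non-negative, same total mass).
    ground_metric: matrix M where M[i, j] is the cost of moving
    one unit of mass from topic i to topic j.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    M = np.asarray(ground_metric, dtype=float)
    n, m = M.shape
    if p.shape != (n,) or q.shape != (m,):
        raise ValueError("histogram sizes must match the ground metric")
    if not np.isclose(p.sum(), q.sum()):
        raise ValueError("histograms must have the same total mass")

    # Flow variables f[i, j] are flattened row-major into a vector.
    # Row constraints: mass leaving topic i equals p[i].
    rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column constraints: mass arriving at topic j equals q[j].
    cols = np.kron(np.ones((1, n)), np.eye(m))

    result = linprog(
        c=M.ravel(),
        A_eq=np.vstack([rows, cols]),
        b_eq=np.concatenate([p, q]),
        bounds=(0, None),
        method="highs",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return float(result.fun)


# Illustrative data only: three topics with a made-up ground metric.
toy_metric = [[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]]
doc_a = [0.5, 0.5, 0.0]
doc_b = [0.0, 0.5, 0.5]
distance = earth_movers_distance(doc_a, doc_b, toy_metric)
```

A general-purpose LP solver keeps the sketch short. For large topic counts a dedicated optimal transport solver would likely be a better fit.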