Presentation No. | 1A4-4 |
---|---|
Title | Dually Extract Semantic from the Web |
Authors | 李 海博 (University of Tokyo, Department of Creative Informatics), 松尾 豊 (University of Tokyo), 石塚 満 (University of Tokyo, Graduate School of Information Science and Technology) |
Time | June 9 (Wed) 17:10–17:30 |
Abstract | Traditional relation extraction requires pre-defined relations and a large amount of human-annotated training data. Open relation extraction, meanwhile, demands a set of heuristic rules to extract all potential relations from text. These requirements reduce the practicality and robustness of information extraction systems. In this paper, we propose a bootstrapping framework that uses a few seed sentences, each marked up with two entities, to expand a ranked list of sentences containing the target relations. During the expansion process, a label propagation algorithm is used to select the most confident entity pairs and context patterns. To rank the extracted sentences according to their relevance to the given seeds, we propose the Multi-View Ranking algorithm, a semi-supervised multi-view learning algorithm that combines information from both the entity pair view and the context pattern view to rank the sentences. |
Paper | PDF file |
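
The label propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bipartite graph construction, the function names `build_graph` and `propagate`, the fixed iteration count, and the toy co-occurrence data are all assumptions made for illustration. Seed entity pairs are clamped to a score of 1.0, and every other node repeatedly takes the weighted average of its neighbours' scores, so entity pairs that share context patterns with the seeds rank higher. The Multi-View Ranking algorithm itself is not attempted here.

```python
from collections import defaultdict


def build_graph(occurrences):
    """Build an undirected bipartite graph linking entity pairs to context patterns.

    `occurrences` is a list of (entity_pair, pattern) tuples; edge weights
    count how often a pair and a pattern co-occur (an assumed weighting).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for pair, pattern in occurrences:
        pair_node, pattern_node = ("pair", pair), ("pattern", pattern)
        graph[pair_node][pattern_node] += 1.0
        graph[pattern_node][pair_node] += 1.0
    return graph


def propagate(graph, seed_pairs, iterations=10):
    """Run a fixed number of synchronous label propagation updates.

    Seed nodes stay clamped at 1.0; every other node becomes the weighted
    mean of its neighbours' scores from the previous iteration.
    """
    seeds = {("pair", p) for p in seed_pairs}
    scores = {node: (1.0 if node in seeds else 0.0) for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in graph.items():
            if node in seeds:
                updated[node] = 1.0
            else:
                total = sum(neighbours.values())
                updated[node] = sum(w * scores[n] for n, w in neighbours.items()) / total
        scores = updated
    return scores


# Toy data (hypothetical, for illustration only).
occurrences = [
    (("Google", "YouTube"), "X acquired Y"),
    (("Oracle", "Sun"), "X acquired Y"),
    (("Oracle", "Sun"), "X bought Y"),
    (("Microsoft", "Skype"), "X bought Y"),
    (("Paris", "France"), "X is the capital of Y"),
    (("Tokyo", "Japan"), "X is the capital of Y"),
]

graph = build_graph(occurrences)
scores = propagate(graph, seed_pairs=[("Google", "YouTube")])

# Rank entity pairs by propagated score, highest first.
ranked_pairs = sorted(
    (node[1] for node in scores if node[0] == "pair"),
    key=lambda p: scores[("pair", p)],
    reverse=True,
)
```

In this toy graph, pairs reachable from the seed through shared patterns receive positive scores that decrease with distance from the seed, while pairs in the unrelated "capital of" component stay at zero. A fixed iteration count is used because, with only positive seeds clamped, running to convergence would push every connected node to the same score.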