
4B1-3 A Corpus for Studies on Scientific Writing Assistance



June 7 (Fri) 09:00-11:20, Room B (International Conference Hall, Room 201)
4B1 Natural Language Processing / Information Retrieval: "Natural Language-6"

Title: A Corpus for Studies on Scientific Writing Assistance
Authors: Nguyen Ngan (National Institute of Informatics)
Yusuke Miyao (National Institute of Informatics, Digital Content and Media Sciences Research Division)
Time: June 7 (Fri) 09:40-10:00
Abstract: As the number of non-native speakers of English grows, the demand for writing assistance applications, including automatic proofreading for advanced learners, is raising new challenges for natural language processing (NLP). Previous research on writing assistance has mostly focused on correcting spelling and grammatical errors. However, the proofreading that advanced learners require involves not only correcting grammatical errors but also paraphrasing a sentence, when necessary, to make it more fluent and less awkward. To meet these requirements, this work aims to construct a corpus that supports research on writing assistance techniques for advanced English learners. Our corpus is a collection of written work by non-native researchers that has been proofread by native English speakers. A new annotation scheme was then used to capture both the spelling/grammatical error corrections and the paraphrases made by the native English proofreaders. The resulting corpus contains 3485 pairs of original and revised sentences, of which 2516 pairs contain grammatical and/or paraphrase corrections.