
3B1-OS22b-1 Spoken Interface for Correcting Phoneme Recognition Errors in Learning of Unknown Words

June 3 (Fri) 09:00-11:55, Room B (capacity 150, Training Room 812)
3B1-OS22b Organized Session "OS-22 Symbol Emergence Robotics and Multimodal Semantic Interaction (2)"

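Presentation number: 3B1-OS22b-1
Title: Spoken Interface for Correcting Phoneme Recognition Errors in Learning of Unknown Words
Authors: Xiang Zuo (左 祥; Kyoto Institute of Technology, Information Science)
Taisuke Sumii (住井 泰介; Kyoto Institute of Technology)
Naoto Iwahashi (岩橋 直人; National Institute of Information and Communications Technology)
Mikio Nakano (中野 幹生; Honda Research Institute Japan Co., Ltd.)
Kotaro Funakoshi (船越 孝太郎; Honda Research Institute Japan Co., Ltd.)
Natsuki Oka (岡 夏樹; Kyoto Institute of Technology, Graduate School of Science and Technology, Division of Information Science)
Time: June 3 (Fri) 09:00-09:20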
Abstract: This paper presents a novel method for learning the phoneme sequences of out-of-vocabulary (OOV) words. With this method, a user can correct the misrecognized phoneme sequence of an OOV word by making repeated corrective utterances. The method is original in two respects: 1) the correction runs interactively rather than as a batch, which makes it more efficient, and 2) the correction is based on open-begin-end dynamic programming matching (OBE-DPM) and generalized posterior probability (GPP), which allows a user to speak only a segment of the word in a corrective utterance. In experiments comparing it with a maximum-likelihood-based baseline run as batch processing, the proposed method achieved 96.8% phoneme accuracy and 79.1% word accuracy in learning new words with fewer than seven corrective utterances, while the baseline achieved only 87.7% and 31.8%. We also found that, with the proposed method, the correct phoneme sequence was obtained within two corrective utterances for most words in the experiments.
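The abstract names open-begin-end dynamic programming matching but does not spell it out. As a rough illustration only, the sketch below aligns a corrective word segment against any contiguous part of a recognized phoneme sequence and replaces the best-matching span. It is not the authors' implementation: the unit edit costs, the function names `obe_dp_match` and `apply_correction`, and the example phonemes are all assumptions, and the GPP-based confidence scoring mentioned in the abstract is left out entirely.

```python
from typing import List, Tuple


def obe_dp_match(segment: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (cost, start, end) such that hypothesis[start:end] best matches segment.

    Assumption: unit costs for substitution, insertion, and deletion.
    """
    m, n = len(segment), len(hypothesis)
    # cost[i][j]: cheapest alignment of segment[:i] with some hypothesis[k:j]
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    # start[i][j]: the index k where that cheapest alignment begins
    start = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        start[0][j] = j  # open begin: starting anywhere in the hypothesis is free
    for i in range(1, m + 1):
        cost[i][0] = i  # segment phonemes with nothing to align against
        for j in range(1, n + 1):
            sub = cost[i - 1][j - 1] + (segment[i - 1] != hypothesis[j - 1])
            ins = cost[i - 1][j] + 1  # segment phoneme absent from the hypothesis
            dele = cost[i][j - 1] + 1  # hypothesis phoneme absent from the segment
            best = min(sub, ins, dele)
            cost[i][j] = best
            if best == sub:
                start[i][j] = start[i - 1][j - 1]
            elif best == ins:
                start[i][j] = start[i - 1][j]
            else:
                start[i][j] = start[i][j - 1]
    # open end: the match may stop at any position in the hypothesis
    end = min(range(n + 1), key=lambda j: cost[m][j])
    return cost[m][end], start[m][end], end


def apply_correction(segment: List[str], hypothesis: List[str]) -> List[str]:
    """Replace the best-matching span of the hypothesis with the corrective segment."""
    _, start, end = obe_dp_match(segment, hypothesis)
    return hypothesis[:start] + segment + hypothesis[end:]


# Hypothetical example: the recognizer heard "t o m e t o"; the user repeats
# only the segment "m a t o" to correct the second half of the word.
recognized = ["t", "o", "m", "e", "t", "o"]
corrective = ["m", "a", "t", "o"]
print(apply_correction(corrective, recognized))
```

Because both ends of the alignment are free, the user does not have to repeat the whole word. A practical system would presumably also weight the alignment by recognition confidence instead of treating every phoneme mismatch as equally costly.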
Paper (PDF file)