Survey of Methods for Speech and Text Understanding
Abstract
This paper gives a brief survey of existing methods for speech and text understanding, based on an analysis of publications in scientific journals and in the proceedings of leading speech conferences. Two distinct approaches to the problem of language understanding are identified: (1) one based on propositional calculus and (2) one based on recognizing the meaning (the speech intention). Arguments are given in favor of the integral speech processing paradigm being developed at SPIIRAS over the widely known approaches.
Published
2002-04-01
How to cite
Kosarev, Li, Ronzhin, Skidanov, & Savage. (2002). Survey of Methods for Speech and Text Understanding. Trudy SPIIRAN (SPIIRAS Proceedings), 2(1), 157-195. https://doi.org/10.15622/sp.1.12
Section
Articles