Article published in:
Intonational Grammar in Ibero-Romance: Approaches across linguistic subfields
Edited by Meghan E. Armstrong, Nicholas Henriksen and Maria del Mar Vanrell
[Issues in Hispanic and Lusophone Linguistics 6] 2016
pp. 227–248
Towards automatic language processing and intonational labeling in European Portuguese
Helena Moniz | Faculdade de Letras da Universidade de Lisboa
Fernando Batista | Instituto de Engenharia de Sistemas e Computadores – Investigação e Desenvolvimento em Lisboa
Ana Isabel Mata | Faculdade de Letras da Universidade de Lisboa
Isabel Trancoso | Instituto de Engenharia de Sistemas e Computadores – Investigação e Desenvolvimento em Lisboa
This work describes a framework that encompasses multi-layered linguistic information, with a focus on prosodic features (pitch, energy, and tempo patterns). The framework uses these features to distinguish between sentence-form types and between disfluency/fluency repairs, and it contributes to the characterization of the intonational patterns of spontaneous and prepared speech in European Portuguese. Different machine learning methods were applied to discriminate between structural metadata events in both university lectures and map-task dialogues, which contain large amounts of spontaneous speech. Results show that prosodic features, and in particular a set of highly informative ones, are crucial for distinguishing between sentence-form types and between disfluency/fluency repair events. This is the first work for European Portuguese on both the fully automatic processing of multi-layered linguistic descriptions of spoken corpora and intonational labeling.
Keywords: European Portuguese, prosody, speech processing, structural metadata
Published online: 31 March 2016
https://doi.org/10.1075/ihll.6.11mon