Reading Groups

Machine Learning
Tone

Machine Learning

November 15, 2004: Language Models

Language models: Cluster-based retrieval using language models.
Xiaoyong Liu, W. Bruce Croft, July 2004, Proceedings of the 27th annual International Conference on Research and Development in Information Retrieval.
http://ciir.cs.umass.edu/pubfiles/ir-347.pdf

The model is laid out pretty directly, and we get a direct comparison to other retrieval methods.

http://trec.nist.gov/pubs/trec10/papers/cmu-dir-lemur-trec10-final.pdf

There's quite a bit of recent LM/IR stuff in the SIGIR (03-04) proceedings as well, if there's anything that catches someone's eye.

November 22, 2004: Deviation from Randomness

Main Paper

http://search.fub.it/claudio/pdf/CLEF2003.pdf

More theoretical

http://ir.dcs.gla.ac.uk/terrier/publications/p357-amati.pdf

Random indexing

http://www.sics.se/~mange/papers/coling2004.pdf

Maxent

Wang, S., Schuurmans, D. and Zhao, Y. (2003) The latent maximum entropy principle.
http://www.cs.ualberta.ca/~dale/papers/lme.ps.gz

Related papers by similar authors can be found at the same site

http://www.cs.ualberta.ca/~dale/papers.html

November 29, 2004: Iterative Residual Rescaling

Iterative Residual Rescaling: An Analysis and Generalization of LSI
- http://www.cs.cornell.edu/home/llee/papers/ando-lee-sigir01.home.html

December 6, 2004: Spoken Document Retrieval

Document Expansion for Speech Retrieval. Amit Singhal, Fernando Pereira. ACM SIGIR'99, pages 26-33, 1999.
SCAN: Designing and Evaluating User Interfaces to Support Retrieval from Speech Archives. Steve Whittaker, Julia Hirschberg, John Choi, Don Hindle, Fernando Pereira, Amit Singhal. ACM SIGIR'99, 34-41, 1999.

Search conversational speech

http://www.clsp.jhu.edu/research/malach/pubs/sdr_0203.pdf

December 13, 2004: Drago Radev's visit

Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization, Barzilay and Lee (HLT-NAACL 2004).

http://acl.ldc.upenn.edu/hlt-naacl2004/main/pdf/167_Paper.pdf

E. Hovy and Chin-Yew Lin, Automated Text Summarization in SUMMARIST, in Advances in Automatic Text Summarization, 1999.

http://citeseer.ist.psu.edu/cache/papers/cs/22816/http:zSzzSzwww.isi.eduzSznatural-languagezSzpeoplezSzhovyzSzpaperszSz98hovylin-summarist.pdf/hovy99automated.pdf

Dragomir R. Radev; Eduard Hovy; Kathleen McKeown, Introduction to the Special Issue on Summarization

http://acl.ldc.upenn.edu/J/J02/

Automatic Summarization of Open-Domain Multiparty Dialogues in Diverse Genres

http://acl.ldc.upenn.edu/J/J02/J02-4003.pdf

January 3, 2005: NIPS report out

January 10, 2005: Dialogue Act Tagging

A. Venkataraman, L. Ferrer, A. Stolcke, & E. Shriberg (2003), Training a Prosody-based Dialog Act Tagger from Unlabeled Data. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, Hong Kong.

http://www.speech.sri.com/cgi-bin/run-distill?ftp:papers/icassp2003-dialog.ps.gz

Shriberg, E., Bates, R., Stolcke, A., Taylor, P., Jurafsky, D., Ries, K., Coccaro, N., Martin, R., Meteer, M., Van Ess-Dykema, C. (1998). Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? In M. Swerts and J. Hirschberg (eds.) Special Double Issue on Prosody and Conversation. Language and Speech 41(3-4), 439-487.

http://www.speech.sri.com/cgi-bin/run-distill?ftp:papers/LangSpeech98-prosody.ps.gz

Mast et al. (1996). Dialog Act Classification With The Help Of Prosody. Int. Conf. on Spoken Language Processing, volume 3, pages 1728-1731, Philadelphia.

http://citeseer.ist.psu.edu/7744.html

Helen Wright, Massimo Poesio, and Stephen Isard. 1999. Using high level dialogue information for dialogue act recognition using prosodic features. In Proceedings of the ESCA Workshop on Prosody and Dialogue, Eindhoven, September 1999.

http://citeseer.ist.psu.edu/wright99using.html

January 17, 2005: Latent Dirichlet Allocation: Techniques and Application

Latent Dirichlet Allocation (2003), David M. Blei, Andrew Y. Ng, Michael I. Jordan, In: NIPS*14. (2002)

http://citeseer.ist.psu.edu/blei03latent.html

On an Equivalence between PLSI and LDA (2003), Mark Girolami, Ata Kaban, In Proceedings of SIGIR 2003.

http://citeseer.ist.psu.edu/blei03latent.html

Integrating Topics and Syntax, Thomas L. Griffiths, Mark Steyvers, David M. Blei, Joshua B. Tenenbaum, In: Advances in Neural Information Processing Systems, 17

http://www.cs.berkeley.edu/~blei/papers/syntax-semantics.pdf

D. Blei and M. I. Jordan. Modeling annotated data. Proceedings of the 26th annual international ACM SIGIR conference

http://citeseer.ist.psu.edu/712990.html

Monday February 7, 2005: Recognizing and Recognition with Tone and Intonation

Chen & Hasegawa-Johnson 2004, How Prosody Improves Speech Recognition

http://www.ifp.uiuc.edu/speech/pubs/2004/Chen_PDSR_SP04.pdf

Xuejing Sun 2002, Pitch Accent Prediction Using Ensemble Machine Learning

http://citeseer.ist.psu.edu/525846.html

Kurt E. Dusterhoff. Automatic intonation analysis using acoustic data. In Proceedings, ESCA TRW on Dialogue and Prosody, Eindhoven, 1999

http://www.cstr.inf.ed.ac.uk/publications/

February 14, 2005: Max-Margin Methods

Max-Margin Parsing, Ben Taskar, Dan Klein, Mike Collins, Daphne Koller and Christopher Manning, Empirical Methods in Natural Language Processing - EMNLP '2004, p. 1-8

acl.ldc.upenn.edu/acl2004/emnlp/pdf/Taskar.pdf

Hidden Markov Support Vector Machines

http://www.cs.brown.edu/people/altun/pubs/AltTsoHof-ICML2003.pdf

Large Margin Methods for Label Sequence Learning

http://www.cs.brown.edu/people/altun/pubs/AltunHofmann-EuroSpeech2003.pdf

February 21, 2005: Discourse Structure

Derrick Higgins, Jill Burstein, Daniel Marcu, and Claudia Gentile (2004). Evaluating Multiple Aspects of Coherence in Student Essays. In Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference (HLT/NAACL 2004), May 2-5, Boston, MA.
Daniel Marcu and Abdessamad Echihabi (2002). An Unsupervised Approach to Recognizing Discourse Relations. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002), Philadelphia, PA, July 7-12.
Radu Soricut and Daniel Marcu (2003). Sentence Level Discourse Parsing using Syntactic and Lexical Information. Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference (HLT/NAACL), May 27-June 1, Edmonton, Canada.
Jill Burstein, Daniel Marcu, and Kevin Knight (2003). Finding the WRITE Stuff: Automatic Identification of Discourse Structure in Student Essays. IEEE Intelligent Systems, pp. 32-39, Jan/Feb, 2003

Tone

November 13, 2004: PENTA Model

http://home.uchicago.edu/~xuyi/Xu_TAL2004.pdf
http://www.haskins.yale.edu/yixu/XUYI_Lang_Ling.pdf
Quantifies the PENTA model to some extent

http://home.uchicago.edu/~xuyi/xu_xu_luo_icphs99.pdf

Suggests what features the model incorporates in English (as opposed to Mandarin)

http://www.haskins.yale.edu/yixu/Xu_Xu_ms.pdf

November 30, 2004: Tone & Focus

http://home.uchicago.edu/~xuyi/publications.html
Xu, Y. (1999). Effects of tone and focus on the formation and alignment of F0 contours. Journal of Phonetics, 27: 55-105.
Optional

Xu, Y. (1997). Contextual tonal variations in Mandarin. Journal of Phonetics, 25: 61-83.

December 2, 2004

Linguistics tone talk

Thursday, December 2
4 p.m. Stuart 105

Contrast enhancement in narrow focus and clear speech

Rajka Smiljanic
Northwestern University

ABSTRACT

In this talk, I will present data from two projects that examined how the phonological properties of a language (e.g., presence vs. absence of a phonemic contrast, the number of contrastive segments, phonemic vs. allophonic contrasts, etc.) and pragmatic factors (e.g., hyperarticulated/clear speech, narrow focus) condition speech production. I will first discuss the interaction between pragmatic narrow/contrastive focus and lexical contrast as factors that condition variation in pitch and duration in Croatian and Serbian. These two closely related dialects differ in the presence or absence of lexical pitch-accent and vowel length contrasts. It is found that lexical pitch-accent and vowel length contrasts influence the expression of pragmatic focus: the phonemic contrasts are enhanced in narrow focus rather than uniformly made more prominent.

December 9, 2004: Tone & Focus

Yi Xu's publications

http://home.uchicago.edu/~xuyi/publications.html

Xu, Y. (1999). Effects of tone and focus on the formation and alignment of F0 contours. Journal of Phonetics, 27: 55-105.
Xu, Y. (1997). Contextual tonal variations in Mandarin. Journal of Phonetics, 25: 61-83.

January 6, 2005

Chapters 1-3, Gussenhoven, The Phonology of Tone and Intonation

January 13, 2005

Gussenhoven, 4-5.

January 31, 2005: STEM-ML

Quantitative measurement and prediction of prosodic strength in Mandarin. Greg Kochanski, Chilin Shih and Hongyan Jing (2003). Speech Communication. V. 41, No. 4, pp. 625-645.
Hierarchical Structure and Word Strength Prediction of Mandarin Prosody. Greg Kochanski, Chilin Shih and Hongyan Jing (2003). International Journal of Speech Technology, 6 (1), pp. 33-43.
Automated Modelling of Chinese Intonation in Continuous Speech. Greg Kochanski and Chilin Shih (2001). Eurospeech, pp. 911-914, Aalborg, Denmark
- www.prosodies.org/tutorial2002/papers/automated_e2001.pdf

February 10, 2005

Visit with Yi Xu

February 17, 2005: Pitch Accent Prediction

I. Bulyko and M. Ostendorf. "A Bootstrapping Approach to Automating Prosodic Annotation for Constrained Domain Synthesis", in Proceedings of the IEEE Workshop on Speech Synthesis, 2002.

http://ssli.ee.washington.edu/ssli/people/bulyko/

K. Ross and M. Ostendorf, "Prediction of Abstract Prosodic Labels for Speech Synthesis," Computer Speech and Language, Vol. 10, No. 3, July 1996, pp. 155-185.

U of C library holding

Shimei Pan and Julia Hirschberg (2000), Modeling Local Context for Pitch Accent Prediction, ACL'2000, Hong Kong.

http://www1.cs.columbia.edu/~pan/

February 24, 2005: Tone Recognition for Chinese

Y. Qian, Tan Lee and Frank K. Soong, "Tone Information as a Confidence Measure for Improving Cantonese LVCSR", 8th International Conference on Spoken Language Processing, vol. III, pp. 1965-1968, Jeju Island, Korea, October 2004.

http://dsp.ee.cuhk.edu.hk/general/Publications/

Yao Qian, Tan Lee and Frank K. Soong, "Use of tone information in continuous Cantonese speech recognition", in Proceedings of Speech Prosody 2004, pp. 587-590, Nara, Japan, March 2004.

http://www.ee.cuhk.edu.hk/~tanlee/tanlee_pub.html

C. Wang and S. Seneff, "Improved Tone Recognition by Normalizing For Coarticulation and Intonation Effects." Proc. 6th International Conference on Spoken Language Processing, Beijing, China, October 2000.

http://www.sls.csail.mit.edu/sls/publications/

Tone Articulation Modeling for Mandarin Spontaneous Speech Recognition. Jian-lai Zhou, Ye Tian, Yu Shi, Chao Huang, Eric Chang. May 2004.

http://research.microsoft.com/research/pubs/view.aspx?pubid=1289

March 10, 2005 and March 17: Implementing Pitch Target Approximation

Xuejing Sun's thesis: pp. 90-190

babel.ling.northwestern.edu/~jbp/thesis_xuejingsun.pdf

June 14, 2006: Multi-stream DBNs for Tone Recognition

Jeff Bilmes' paper on recognizing tonal phones with DBNs
Xin Lei, Gang Ji, Tim Ng, Jeff Bilmes, and Mari Ostendorf. DBN-Based Multi-stream Models for Mandarin Toneme Recognition. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Philadelphia, PA, March 2005.

http://ssli.ee.washington.edu/people/bilmes/mypapers/xl_icassp05.pdf

June 21, 2006: Minimal Supervision for Pitch Accent

Wong et al.'s paper on ME, self-training, and co-training
Using Weakly Supervised Learning to Improve Prosody Labeling, D. Wong, M. Ostendorf and J. Kahn

https://www.ee.washington.edu/techsite/papers/refer/UWEETR-2005-0003.html