Syllabus for LING 580

Seminar on Machine Translation

Winter 2006

 

Instructor: Fei Xia

Time & Location: T 3:30-5:50pm, JHN 111

Office Hours: Friday, 10:30am-12:30pm

Office Phone: (206) 543-9764

Email: fxia at u (include "Ling580" in the subject line)

 

Course website: http://faculty.washington.edu/fxia/courses/LING580MT-Winter2006.shtml

 

 

 

Course Description:

In this seminar, we will discuss important papers on machine translation, focusing on statistical MT and transfer-based MT. Students will gain hands-on experience by experimenting with various methods for improving a phrase-based SMT system.

 

 

Course Texts:

  None.  Reading materials are online.

 

 

Prerequisites:

  LING 570 and LING 571

  Stat 391 (Prob. and Stats for CS) or equivalent

  Programming:

  - C/C++

  - basic Unix/Linux commands (e.g., ls, cd, ln, sort, head): tutorials on Unix

  - Perl (optional): tutorials on Perl

 

 

Grading:

   Leading discussion (50%): each student will present a paper in class.

   Term paper (40%).

   Class participation (10%).

 

 

 

Schedule:

 

Week 1 (1/3)
  Topic: Introduction to MT; Word-based SMT

Week 2 (1/10)
  Topic: Structural divergence
  - Dorr (1994): by Fei
  - Hwa et al. (2002): by Jeremy
  Reading: Dorr (CL, 1994); Hwa et al. (ACL, 2002)

Week 3 (1/17)
  Topic: Phrase-based SMT (I)
  - Marcu and Wong (2002): by Ping
  - Koehn et al. (2003): by Al
  Reading: Marcu and Wong (EMNLP, 2002); Koehn et al. (NAACL, 2003)

Week 4 (1/24)
  Topic: Phrase-based SMT (II)
  - Och and Ney (2002): by Bill
  - Och (2003): by Anna
  Reading: Och and Ney (ACL, 2002); Och (ACL, 2003)

Week 5 (1/31)
  Topic: Transfer-based SMT (I)
  - Wu (1997): by David
  - Chiang (2005): by Achim
  Reading: Wu (1997); Chiang (ACL, 2005)

Week 6 (2/7)
  Topic: Transfer-based MT (II)
  - Yamada and Knight (2001): by Gabriel
  - Graehl and Knight (2004): by Zhengbo
  Reading: Yamada and Knight (ACL, 2001); Graehl and Knight (NAACL, 2004)

Week 7 (2/14)
  Topic: Transfer-based MT (III)
  - Alshawi et al. (2000): by Yow
  - Lin (2004): by Joshua
  Reading: Alshawi et al. (CL, 2000); Lin (COLING, 2004)

Week 8 (2/21)
  Topic: Transfer-based MT (IV)
  - Gildea (2003): by Scott
  - Gildea (2004): by Michael
  Reading: Gildea (ACL, 2003); Gildea (EMNLP, 2004)

Week 9 (2/28)
  Topic: Hybrid MT
  - Och et al. (2003), Part 1: by Shauna
  - Och et al. (2003), Part 2: by Ethen
  Reading: Och et al. (manuscript, 2003a)

Week 10 (3/7)
  Topic: Transfer-based MT (V)
  - Burbank et al. (manuscript, 2005a): by Sauleh
  Reading: Burbank et al. (manuscript, 2005a)

 

 

Reading materials (required):

 

  • Divergences:
    • Dorr (CL, 1994): Machine Translation Divergences: A Formal Description and Proposed Solution. Computational Linguistics, 1994.
    • Hwa et al. (ACL, 2002): Evaluating Translational Correspondence Using Annotation Projection. In Proceedings of the 40th Annual Conference of the Association for Computational Linguistics (ACL-02), 2002.

 

  • Phrase-based SMT:
    • Marcu and Wong (EMNLP, 2002): A Phrase-Based, Joint Probability Model for Statistical Machine Translation. In EMNLP-02, 2002.
    • Koehn et al. (NAACL, 2003): Statistical Phrase-Based Translation. In Proceedings of the 2003 Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-03), Edmonton, Alberta, 2003.
    • Och and Ney (ACL, 2002): Discriminative Training and Maximum Entropy Models for Statistical Machine Translation. In Proceedings of the 40th Annual Conference of the Association for Computational Linguistics (ACL-02), Philadelphia, PA, 2002.
    • Och (ACL, 2003): Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of the 41st Annual Conference of the Association for Computational Linguistics (ACL-03), 2003.

 

 

 

 

 

 

Reference papers (not required):

 

 

  • Word-based SMT:
    • Brown et al. (CL, 1993): The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2):263-311, 1993.

 

 

  • Decoding
    • Knight (CL, 1999): Decoding Complexity in Word-Replacement Translation Models. Computational Linguistics, 25(4), 1999.
    • Tillmann and Ney (CL, 2003): Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation. Computational Linguistics, 29:97-133, 2003.
    • Germann et al. (ACL, 2001): Fast Decoding and Optimal Decoding for Machine Translation. In Proceedings of the Conference of the Association for Computational Linguistics (ACL-2001), Toulouse, France, July 2001.
    • Germann et al. (NAACL, 2003): Greedy Decoding for Statistical Machine Translation in Almost Linear Time. In Proceedings of HLT-NAACL 2003, Edmonton, AB, Canada.

 

  • Transfer-based MT:
    • Wu and Wong (ACL, 1998): Machine Translation with a Stochastic Grammatical Channel. In Proceedings of the 17th International Conference on Computational Linguistics (COLING/ACL-98), 1998.

 

  • Example-based MT:
    • Sumita and Iida (ACL, 1991): Experiments and Prospects of Example-Based Machine Translation, Eiichiro Sumita and Hitoshi Iida. In 29th Annual Meeting of the Association for Computational Linguistics, 1991.
    • Brown (COLING, 1996): Example-Based Machine Translation in the Pangloss System, Ralf D. Brown. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), 1996.