Three paradigms have dominated machine translation (MT): rule-based machine translation (RBMT), statistical machine translation (SMT), and example-based machine translation (EBMT). These paradigms differ in the way they handle the three fundamental processes in MT: analysis, transfer, and generation (ATG). In its pure form, RBMT uses rules, while SMT uses data. EBMT tries a combination: data supplies translation parts that rules recombine to produce a translation.
Machine Translation compares and contrasts the salient principles and practices of RBMT, SMT, and EBMT, offering an exposition of language phenomena followed by modeling and experimentation.
Machine Translation is designed for advanced undergraduate-level and graduate-level courses in machine translation and natural language processing. The book also makes a handy professional reference for computer engineers.
List of Figures
List of Tables
Preface
Acknowledgments
About the Author
Introduction
A Feel for a Modern Approach to Machine Translation: Data-Driven MT
MT Approaches: Vauquois Triangle
Understanding Transfer over the Vauquois Triangle
Understanding Ascending and Descending Transfer
Language Divergence with Illustration between Hindi and English
Syntactic Divergence
Lexical-Semantic Divergence
Three Major Paradigms of Machine Translation
MT Evaluation
Adequacy and Fluency
Automatic Evaluation of MT Output
Summary
Further Reading
Learning Bilingual Word Mappings
A Combinatorial Argument
Necessary and Sufficient Conditions for Deterministic Alignment in Case of One-to-One Word Mapping
A Naïve Estimate for Corpora Requirement
Deeper Look at One-to-One Alignment
Drawing Parallels with Part of Speech Tagging
Heuristics-Based Computation of the VE × VF Table
Iterative (EM-Based) Computation of the VE × VF Table
Initialization and Iteration 1 of EM
Iteration 2
Iteration 3
Mathematics of Alignment
A Few Illustrative Problems to Clarify Application of EM
Derivation of Alignment Probabilities
Expressing the E- and M-Steps in Count Form
Complexity Considerations
Storage
Time
EM: Study of Progress in Parameter Values
Necessity of at Least Two Sentences
One-Same-Rest-Changed Situation
One-Changed-Rest-Same Situation
Summary
Further Reading
IBM Model of Alignment
Factors Influencing P(f|e)
Alignment Factor a
Length Factor m
IBM Model 1
The Problem of Summation over Product in IBM Model 1
EM for Computing P(f|e)
Alignment in a New Input Sentence Pair
Translating a New Sentence in IBM Model 1: Decoding
IBM Model 2
EM for Computing P(f|e) in IBM Model 2
Justification for and Linguistic Viability of P(i|j,l,m)
IBM Model 3
Summary
Further Reading
Phrase-Based Machine Translation
Need for Phrase Alignment
Case of Promotional/Demotional Divergence
Case of Multiword (Includes Idioms)
Phrases Are Not Necessarily Linguistic Phrases
An Example to Illustrate Phrase Alignment Technique
Two-Way Alignments
Symmetrization
Expansion of Aligned Words to Phrases
Phrase Table
Mathematics of Phrase-Based SMT
Understanding Phrase-Based Translation through an Example
Deriving Translation Model and Calculating Translation and Distortion Probabilities
Giving Different Weights to Model Parameters
Fixing λ Values: Tuning
Decoding
Example to Illustrate Decoding
Moses
Installing Moses
Workflow for Building a Phrase-Based SMT System
Preprocessing for Moses
Training Language Model
Training Phrase Model
Tuning
Decoding Test Data
Evaluation Metric
More on Moses
Summary
Further Reading
Rule-Based Machine Translation (RBMT)
Two Kinds of RBMT: Interlingua and Transfer
What Exactly Is Interlingua?
Illustration of Different Levels of Transfer
Universal Networking Language (UNL)
Illustration of UNL
UNL Expressions as Binary Predicates
Why UNL?
Interlingua and Word Knowledge
How Universal Are UWs?
UWs and Multilinguality
UWs and Multiwords
UW Dictionary and Wordnet
Comparing and Contrasting UW Dictionary and Wordnet
Translation Using Interlingua
Illustration of Analysis and Generation
Details of English-to-UNL Conversion: With Illustration
Illustrated UNL Generation
UNL-to-Hindi Conversion: With Illustration
Function Word Insertion
Case Identification and Morphology Generation
Representative Rules for Function Words Insertion
Syntax Planning
Transfer-Based MT
What Exactly Are Transfer Rules?
Case Study of Marathi-Hindi Transfer-Based MT
Krudant: The Crux of the Matter in M-H MT
M-H MT System
Summary
Further Reading
Example-Based Machine Translation
Illustration of Essential Steps of EBMT
Deeper Look at EBMT’s Working
Word Matching
Matching of Have
EBMT and Case-Based Reasoning
Text Similarity Computation
Word Based Similarity
Tree and Graph Based Similarity
CBR’s Similarity Computation Adapted to EBMT
Recombination: Adaptation on Retrieved Examples
Based on Sentence Parts
Based on Properties of Sentence Parts
Recombination Using Parts of Semantic Graph
EBMT and Translation Memory
EBMT and SMT
Summary
Further Reading
Index