Zero-shot translation ability emerges when a multilingual model is trained on certain translation directions; the model can then translate directly in unseen directions. Alternatively, zero-shot translation can be accomplished by pivoting through a third language (e.g., English). In our work, we observe that both direct and pivot translations are noisy and perform unsatisfactorily. We propose EBBS, an ensemble method with a novel bi-level beam search algorithm: at the lower level, each ensemble component explores its own predictions step by step, and at the upper level, the components are synchronized by a “soft voting” mechanism. Results on two popular multilingual translation datasets show that EBBS consistently outperforms direct and pivot translations as well as existing ensemble techniques. Further, we can distill the ensemble’s knowledge back into the multilingual model to improve inference efficiency; notably, our EBBS-based distillation does not sacrifice translation quality and can even improve it.
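The bi-level idea can be illustrated with a toy sketch. This is not the authors' implementation: the component interface (a function from a token prefix to next-token probabilities), the helper names `make_table_model`, `sequence_prob`, and `ebbs_sketch`, the two table-based toy models, and the choice to average probabilities in the voting step are all assumptions made for illustration.

```python
from collections import defaultdict

EOS = "</s>"  # assumed end-of-sequence token


def make_table_model(table):
    """Hypothetical toy component: next-token probabilities keyed by the last token."""
    def model(prefix):
        last = prefix[-1] if prefix else None
        return table.get(last, {EOS: 1.0})
    return model


def sequence_prob(component, seq):
    """Probability a component assigns to a whole sequence (product of step probabilities)."""
    prob = 1.0
    for i, tok in enumerate(seq):
        prob *= component(seq[:i]).get(tok, 0.0)
    return prob


def ebbs_sketch(components, beam_size=2, max_len=5):
    upper = [()]             # shared upper-level beam, starts with the empty prefix
    ranked = [((), 1.0)]
    for _ in range(max_len):
        votes = defaultdict(float)
        for comp in components:
            # Lower level: each component expands the shared beam on its own.
            lower = {}
            for prefix in upper:
                base = sequence_prob(comp, prefix)
                if prefix and prefix[-1] == EOS:
                    lower[prefix] = base  # finished hypotheses are carried over
                    continue
                for tok, p in comp(prefix).items():
                    lower[prefix + (tok,)] = base * p
            top = sorted(lower.items(), key=lambda kv: kv[1], reverse=True)[:beam_size]
            # Upper level: soft voting, here an average of component probabilities.
            # A candidate missing from a component's lower beam gets no vote from it.
            for cand, prob in top:
                votes[cand] += prob / len(components)
        ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)[:beam_size]
        upper = [cand for cand, _ in ranked]
        if all(c and c[-1] == EOS for c in upper):
            break
    return ranked


# Two toy components that disagree on the first token.
model_a = make_table_model({
    None: {"hello": 0.6, "hi": 0.4},
    "hello": {"world": 0.7, EOS: 0.3},
    "hi": {"world": 0.7, EOS: 0.3},
})
model_b = make_table_model({
    None: {"hi": 0.7, "hello": 0.3},
    "hello": {"world": 0.9, EOS: 0.1},
    "hi": {"world": 0.9, EOS: 0.1},
})

result = ebbs_sketch([model_a, model_b], beam_size=2)
print(result[0][0])
```

In this toy run the two components prefer different first tokens, and the voted upper-level beam settles on the hypothesis with the highest averaged probability. In a real system the components would be the direct and pivot translation paths rather than lookup tables.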
Bibtex
@misc{wen2024ebbs,
title={EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation},
author={Yuqiao Wen and Behzad Shayegh and Chenyang Huang and Yanshuai Cao and Lili Mou},
year={2024},
eprint={2403.00144},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Related Research
-
Identifying and Addressing Delusions for Target-Directed Decision-Making
M. Zhao, T. Sylvain, D. Precup, and Y. Bengio. Workshop at Conference on Neural Information Processing Systems (NeurIPS)
-
Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models
S. Mahdavi, R. Aoki, K. Tang, and Y. Cao. Conference on Neural Information Processing Systems (NeurIPS)
-
Jump Starting Bandits with LLM-Generated Prior Knowledge
P. A. Alamdari, Y. Cao, and K. Wilson. Conference on Empirical Methods in Natural Language Processing