CROSS-LINGUAL ABILITY OF MULTILINGUAL BERT: AN EMPIRICAL STUDY
Multilingual BERT (M-BERT) has shown surprising cross-lingual abilities, even though it is trained without any cross-lingual objective. In this work, we analyze what contributes to this multilinguality along three dimensions: the linguistic properties of the languages, the architecture of the model, and the learning objectives.
- Linguistic properties
- Architecture
- Learning objectives
Please refer to our paper for more details.
If you would like to pre-train a BERT model on a Fake language or on permuted sentences, see preprocessing-scripts for how to create the TFRecords used for BERT training.
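To give a concrete picture of the expected data format, here is a minimal sketch (not the actual preprocessing-scripts pipeline) of how already-tokenized, possibly permuted sentences can be serialized into a TFRecord file. The feature names follow the standard BERT pretraining layout; the output file name and the toy token ids are placeholders.

```python
# Minimal sketch: write token-id features for each training instance into a
# TFRecord file, the on-disk format BERT pretraining consumes. The actual
# preprocessing-scripts pipeline may emit additional features (masked LM
# positions, next-sentence labels, etc.).
import tensorflow as tf

def int64_feature(values):
    """Wrap a list of ints as a tf.train.Feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))

def write_instances(instances, output_path):
    """Serialize (input_ids, segment_ids) pairs into a TFRecord file."""
    with tf.io.TFRecordWriter(output_path) as writer:
        for input_ids, segment_ids in instances:
            features = {
                "input_ids": int64_feature(input_ids),
                "input_mask": int64_feature([1] * len(input_ids)),
                "segment_ids": int64_feature(segment_ids),
            }
            example = tf.train.Example(features=tf.train.Features(feature=features))
            writer.write(example.SerializeToString())

# Toy usage: two already-tokenized (and possibly permuted) instances.
write_instances(
    [([101, 2023, 2003, 102], [0, 0, 0, 0]),
     ([101, 2178, 6251, 102], [0, 0, 0, 0])],
    "pretrain_data.tfrecord",  # placeholder path
)
```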
Once you have uploaded the TFRecords to Google Cloud, you can set up an instance and start BERT pre-training via bert-running-scripts.
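As a rough illustration of this step, the sketch below launches pre-training through the standard run_pretraining.py interface of the upstream BERT codebase; the actual entry point and flags used by bert-running-scripts may differ, and the GCS bucket, TPU name, and hyperparameter values shown here are placeholders.

```python
# Hedged sketch: kick off BERT pre-training from a Cloud VM, assuming the
# standard run_pretraining.py flags from the upstream BERT repository.
import subprocess

BUCKET = "gs://your-bucket"   # hypothetical GCS bucket holding the TFRecords
TPU_NAME = "your-tpu"         # hypothetical TPU instance name

subprocess.run([
    "python", "run_pretraining.py",
    f"--input_file={BUCKET}/tfrecords/pretrain_data.tfrecord",
    f"--output_dir={BUCKET}/models/fake_english_bert",
    "--do_train=True",
    "--bert_config_file=bert_config.json",
    "--train_batch_size=256",
    "--max_seq_length=128",
    "--max_predictions_per_seq=20",
    "--num_train_steps=1000000",
    "--num_warmup_steps=10000",
    "--learning_rate=1e-4",
    "--use_tpu=True",
    f"--tpu_name={TPU_NAME}",
], check=True)
```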
With the models we provide, or ones you have just trained, you can evaluate on two tasks: NER and entailment. See evaluating-scripts.
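For orientation, the sketch below shows the kind of metric the NER evaluation reduces to: entity-level F1 over BIO tag sequences, with the model fine-tuned on the source language and scored on the target-language test set. It is illustrative only, with simplified BIO handling; the real evaluation code lives in evaluating-scripts.

```python
# Minimal sketch of entity-level F1 scoring for cross-lingual NER transfer.
def extract_entities(tags):
    """Collect (start, end, type) spans from a BIO tag sequence (simplified)."""
    entities, start, ent_type = set(), None, None
    for i, tag in enumerate(tags + ["O"]):          # sentinel flushes the last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:
                entities.add((start, i, ent_type))
                start, ent_type = None, None
            if tag.startswith("B-"):
                start, ent_type = i, tag[2:]
    return entities

def entity_f1(gold_tags, pred_tags):
    """Micro-averaged entity-level F1 over lists of tag sequences."""
    gold, pred = set(), set()
    for sent_id, (g, p) in enumerate(zip(gold_tags, pred_tags)):
        gold |= {(sent_id,) + e for e in extract_entities(g)}
        pred |= {(sent_id,) + e for e in extract_entities(p)}
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy example: one target-language sentence, perfect prediction.
print(entity_f1([["B-PER", "I-PER", "O"]], [["B-PER", "I-PER", "O"]]))  # 1.0
```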
We are releasing the pre-trained BERT models used in this study. See data for detailed paths to download them.
Please cite the following paper if you find our work useful. Thanks!
Karthikeyan K, Zihan Wang, Stephen Mayhew, and Dan Roth. "Cross-Lingual Ability of Multilingual BERT: An Empirical Study." arXiv preprint arXiv:1912.07840 (2019).
@article{wang2019cross,
  title={Cross-Lingual Ability of Multilingual BERT: An Empirical Study},
  author={K, Karthikeyan and Wang, Zihan and Mayhew, Stephen and Roth, Dan},
  journal={arXiv preprint arXiv:1912.07840},
  year={2019}
}