"LASER achieves these results by embedding all languages jointly in a single shared space (rather than having a separate model for each)". There could be a good reason for why the mutual embedding of several languages works better than individual, beyond the extra data. If human languages share some minimal representation (universality so to say), training on multiple languages may be required to extract it with today's techniques, since training on just one language is bound to overfit to its particulars.