Modern Cross-Lingual Information Extraction

Luis Fernando Guzman Nateras

Applications such as automated personal assistants, automatic question answering, and machine translation systems have become mainstays of modern culture thanks to considerable recent advances in Natural Language Processing research. However, the vast majority of such efforts remain limited to a small set of languages. With more than 7,000 languages spoken around the world, this unbalanced focus leaves marginalized communities unable to take advantage of such technological innovations. Cross-Lingual Learning seeks to address this inequality by transferring knowledge from a high-resource source language to a low-resource target language. This paper provides a survey of recent efforts in Cross-Lingual Information Extraction (CLIE). We first provide background on the resources leveraged by cross-lingual methods and the knowledge-transfer paradigms that characterize them. We then organize state-of-the-art methods into a taxonomy based on the information extraction sub-task they tackle and the knowledge-transfer archetype they employ. Finally, we discuss several promising directions for future CLIE research.