Legal-BigBird: An Adapted Long-Range Transformer for Legal Documents

Published in Black in AI Workshop NeurIPS 2021, 2021

Recommended citation: Loic Kwate Dassi. Legal-BigBird: An Adapted Long-Range Transformer for Legal Documents. NeurIPS 2021 Workshop: Black in AI, 2021. https://www.researchgate.net/profile/Loic-Kwate-Dassi/publication/356930560_Legal-BigBird_An_Adapted_Long-Range_Transformer_for_Legal_Documents/links/61b5fdee63bbd9324289b54e/Legal-BigBird-An-Adapted-Long-Range-Transformer-for-Legal-Documents.pdf

Download paper here

The legal domain is attracting considerable attention in natural language processing (NLP) because of the number of legal documents (contracts, business deals, etc.) generated in professional activities and the business-logic processing those documents require. Processing legal documents is particularly cumbersome because of the context-specific knowledge they demand and their extensive length. BigBird has achieved strong results on long-range tasks, both in computational efficiency and in representation learning. Few researchers, however, have investigated the ability of long-range Transformer models to tackle the knowledge representation problem in the legal domain. In this work we present an adaptation of the long-range Transformer-based model BigBird to the legal domain, complemented with a use case in legal case retrieval. We continued the training of BigBird on legal corpora with the self-supervised masked language modeling task. We then tested the pre-trained models on legal case retrieval without fine-tuning. We show that adapting BigBird to legal corpora improves the knowledge representation of documents and outperforms vanilla BigBird by 5 in accuracy score on the same task.
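
To make the retrieval-without-fine-tuning setup concrete, here is a minimal sketch of how pre-computed document embeddings could be ranked and scored. It is an illustration under stated assumptions, not the paper's evaluation code: the abstract does not say how documents are compared, so cosine similarity and top-1 accuracy are assumptions, and the names `cosine`, `rank_cases`, `top1_accuracy` and the toy vectors are hypothetical stand-ins for embeddings that a pre-trained encoder such as BigBird would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_cases(query_vec, case_vecs):
    """Return (case_id, score) pairs sorted from most to least similar to the query."""
    scored = [(case_id, cosine(query_vec, vec)) for case_id, vec in case_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top1_accuracy(queries, case_vecs):
    """Fraction of queries whose top-ranked case is the labeled relevant case."""
    hits = 0
    for query_vec, relevant_id in queries:
        best_id, _ = rank_cases(query_vec, case_vecs)[0]
        hits += int(best_id == relevant_id)
    return hits / len(queries)


# Toy embeddings (hypothetical); in practice these would come from the encoder.
case_vecs = {
    "case_a": [0.9, 0.1, 0.0],
    "case_b": [0.1, 0.8, 0.1],
    "case_c": [0.0, 0.2, 0.9],
}

# Each query is paired with the id of the case assumed to be relevant.
queries = [
    ([1.0, 0.0, 0.1], "case_a"),
    ([0.0, 1.0, 0.0], "case_b"),
    ([0.7, 0.6, 0.0], "case_c"),  # deliberately mismatched to show a retrieval miss
]

if __name__ == "__main__":
    print(rank_cases(queries[0][0], case_vecs))
    print(top1_accuracy(queries, case_vecs))
```

Because no parameters are updated at retrieval time, any difference in accuracy between two encoders under a setup like this comes only from the quality of their embeddings, which is the property the comparison against vanilla BigBird is meant to isolate.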