Labeling costs in legal artificial intelligence tasks are high, so training a robust model at low cost is a challenge. In this paper, we propose LAIAugment, an approach that aims to enhance few-shot learning capability in legal artificial intelligence tasks. Specifically, we first use self-training to label unlabelled data, which enhances the feature learning capability of the model. Moreover, we search for datasets that are similar to the training set by means of an improved text similarity function. We conducted experimental analyses on three legal artificial intelligence tasks (evidence extraction, legal element extraction, and case multi-label prediction), which together comprise 3500 judgement documents. The experimental results show that the proposed LAIAugment method achieves an average F1-score of 72.3% on the three legal AI tasks, which is 1.93% higher than the baseline model. It also shows a large improvement in few-shot learning.
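The self-training step described above can be illustrated with a minimal sketch. This is not the LAIAugment implementation: the abstract does not specify the model, the confidence criterion, or the improved similarity function, so the sketch assumes a toy nearest-centroid classifier over bag-of-words counts, plain cosine similarity, and a hypothetical margin threshold. All function names, the `threshold` and `max_rounds` parameters, and the example texts are illustrative assumptions.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a lower-cased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Plain cosine similarity between two term-count vectors (an assumption;
    the paper's improved similarity function is not given in the abstract)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroids(labeled):
    """Sum the term counts of all examples that share a label."""
    out = {}
    for text, label in labeled:
        out.setdefault(label, Counter()).update(bow(text))
    return out


def predict(cents, text):
    """Return (best_label, margin); margin is the gap between the two
    highest class similarities and serves as a crude confidence score."""
    vec = bow(text)
    scores = sorted(((cosine(vec, c), lab) for lab, c in cents.items()),
                    reverse=True)
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(labeled, unlabeled, threshold=0.3, max_rounds=5):
    """Repeatedly pseudo-label confident unlabeled texts and add them to
    the training set; stop when no text clears the threshold."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, rest = [], []
        for text in pool:
            label, margin = predict(cents, text)
            (confident if margin >= threshold else rest).append((text, label))
        if not confident:
            break
        labeled.extend(confident)
        pool = [text for text, _ in rest]
    return labeled, pool


# Hypothetical toy data, not drawn from the paper's 3500 documents.
seed = [("contract breach damages", "civil"),
        ("theft robbery sentence", "criminal")]
raw = ["breach of contract damages claim",
       "robbery and theft sentence appeal",
       "hearing scheduled tuesday"]
augmented, remaining = self_train(seed, raw)
```

In this toy run the two texts that overlap with a class centroid are pseudo-labeled and added to the training set, while the text with no overlap stays in the unlabeled pool. A real system would replace the centroid classifier with the task model and calibrate the threshold on held-out data.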