Abstract
Smart court technologies make full use of modern science, such as artificial intelligence, the Internet of Things, and cloud computing, to promote the modernization of the trial system and trial capabilities. They can improve the efficiency of case handling and bring convenience to the people. Article recommendation is an important part of intelligent trials. For ordinary people without a legal background, traditional information retrieval systems that search laws and regulations by keywords are not applicable, because such users cannot extract professional legal vocabulary from complex case descriptions. This paper proposes a law recommendation framework, called LawRec, based on the Bidirectional Encoder Representations from Transformers (BERT) and Skip-Recurrent Neural Network (Skip-RNN) models. It integrates the knowledge of legal provisions with the case description and uses the BERT model to learn the case description text and the legal knowledge, respectively. Finally, laws and regulations for cases can be recommended. Experimental results show that the proposed LawRec achieves better performance than state-of-the-art methods.
1. Introduction
Artificial intelligence technology has flourished in both academia and industry. Face recognition, voice recognition, and other intelligent technologies are developing rapidly [1]. Intelligent products such as smart speakers and sweeping robots have entered thousands of households. Smart court technologies make full use of modern science, such as artificial intelligence, the Internet of Things, big data, and cloud computing, to promote the modernization of the trial system and trial capabilities, thereby improving the efficiency of case handling and bringing convenience to the people [2].
With the step-by-step advancement of the courts' informatization, the record carrier of case information and the adjudication process has been transformed from paper files to electronic filing [3]. Relying on the rapid development of the Internet, case records are no longer limited to a certain court, city, or province; instead, there is a nationwide network of judgment documents. These conditions have led to the creation of a huge library of judicial documents with standardized formats. A judgment document is the record and summary, written after the judge completes the trial, of the case, the facts of the case, the trial process, and the basis of the judgment [4]. It contains a large amount of information. These accumulated judgment documents have become powerful data support for legal research, providing a good foundation for subsequent intelligent applications.
Article recommendation is an important part of intelligent trials. Because the law is the basis for the outcome of a trial, the judge must handle the case in accordance with the law; therefore, the applicable statutes indicate the direction of the trial to a certain extent. In addition, the value of legal recommendation is also reflected in the help it can provide to the various roles involved in legal cases [5]. Judges trying cases can handle them more efficiently if they can learn from the trial information of similar past cases. Lawyers defending the plaintiff or the defendant can build stronger arguments for their clients if they can quickly find the applicable provisions among a variety of laws and regulations. Plaintiffs and defendants who lack legal knowledge have no way of knowing, without the help of professionals, whether there are suitable statutes to protect their rights and interests, and a system that can correctly predict statutes can save them time and money on legal consultation [6].
For ordinary people without a legal background, traditional information retrieval systems that search laws and regulations by keywords are not applicable, because such users cannot extract professional legal vocabulary from complex case descriptions [7]. The Bidirectional Encoder Representations from Transformers (BERT) model has powerful text representation and text understanding capabilities. It has been widely used in semantic understanding tasks such as entity recognition and text classification, but it is rarely used in the field of legal recommendation. This paper proposes a law recommendation framework, called LawRec, based on BERT and Skip-Recurrent Neural Network (Skip-RNN) models [8], which integrates the knowledge of legal provisions with the case description and uses the BERT model to learn the case description text and the legal knowledge, respectively. Finally, laws and regulations for cases can be recommended.
The rest of the paper is structured as follows: Section 2 introduces the related work on law recommendation. Section 3 describes the LawRec framework. Section 4 gives the experimental analysis and results. Section 5 concludes the paper.
2. Related Work
At present, research on judicial intelligence at home and abroad has achieved certain results. Work [9] constructed a dataset of 2.6 million criminal cases for trial prediction, taking case facts as input and predicting three targets: cited articles, charges, and jail time. Reference [10] proposed a model for predicting whether a court will uphold or overturn a judgment; by analyzing a lawyer's historical case handling and courtroom performance, the lawyer is scored, and lawyers are then recommended according to the current case type. There are also studies on criminal cases. Work [11] used a bidirectional Gated Recurrent Unit (GRU) model to predict criminal charges based on court-found facts and legal grounds. Work [12] extracted a logical basis from case facts through reinforcement learning, which enhanced the interpretability of charge prediction. Work [13] regarded the court opinion as the interpretation of the crime and used a conditional seq2seq model to generate the judge's analysis process from the criminal facts of the case.
In terms of recommendation algorithms, recommender systems first appeared in the 1990s [14, 15], which provide users with suggestions through historical information analysis and help users quickly find useful information. Collaborative filtering [16] is one of the most widely used algorithms in the field of recommender systems, involving social, shopping, finance, law, and other fields. Work [17] proposed a collaborative filtering-based network news system to help people find favorite articles in a large stream of information. Work [18] proposed content-based collaborative filtering to solve the problem that the workload of traditional methods increases with the increase of system participants.
Considering law recommendation specifically, some scholars have conducted partial research. Most of it is aimed at expert users such as judges and lawyers and focuses on information retrieval or keyword-based classification. How to make computers correctly understand the meaning of natural language has always been a topic of academic research, and in recent years neural network methods have made breakthrough progress in this area. Work [19] applied neural networks to lexical error correction. Work [20] proposed a neural network combining dynamic pooling and recursive autoencoders for paraphrase detection. Work [21] used a CNN for text classification and achieved better results than other models. Work [22] designed a court judgment evaluation model based on a BP neural network. Work [23] designed a model based on bidirectional long short-term memory networks that can recognize legal text.
3. LawRec: BERT-Based Law Recommendation Framework
The Bidirectional Encoder Representations from Transformers (BERT) model has powerful text representation and text understanding capabilities. It has been widely used in semantic understanding tasks such as named entity recognition and text classification, but it is rarely used in the field of legal recommendation. This paper proposes a law recommendation framework, called LawRec, based on BERT and Skip-RNN models, which integrates the knowledge of legal provisions with the case description and uses the BERT model to learn the case description text and the legal knowledge, respectively. Finally, laws and regulations for cases can be recommended.
This paper proposes a law recommendation method based on knowledge fusion in the judicial field. The overall structure is shown in Figure 1. The proposed model includes legal rule extraction, BERT-based encoding, and law recommendation. The legal knowledge extraction layer extracts keywords from legal texts in the judicial field to obtain the core legal knowledge. The BERT model, combined with the Skip-RNN, produces semantic representations of the case description text and the legal knowledge, yielding their semantic representation vectors. The legal rule knowledge integration layer is mainly based on the attention mechanism; it fuses the legal rule knowledge features with the case description features, producing a case description feature vector fused with legal rule knowledge. The legal recommendation layer is similar to traditional legal recommendation frameworks and adopts the idea of text classification to produce the final recommendation.
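The layered pipeline described above can be sketched as a chain of composable stages. All stage implementations and names below (extract_law_keywords, encode, fuse, recommend) are trivial illustrative stand-ins chosen for readability, not the paper's actual models:

```python
# Hypothetical sketch of the LawRec pipeline stages; each stage is a toy
# stand-in for the corresponding layer described in the paper.

def extract_law_keywords(law_text):
    """Legal knowledge extraction layer: keep the longest words as 'keywords'."""
    words = law_text.split()
    return sorted(set(words), key=len, reverse=True)[:5]

def encode(text):
    """Stand-in for BERT + Skip-RNN encoding: a bag-of-characters vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def fuse(case_vec, law_vec):
    """Knowledge integration layer: element-wise weighted sum (toy fusion)."""
    return [0.5 * c + 0.5 * k for c, k in zip(case_vec, law_vec)]

def recommend(fused_vec, num_articles=3):
    """Recommendation layer: pick the article index with the largest score."""
    scores = [(sum(fused_vec) * (i + 1)) % 7 for i in range(num_articles)]
    return max(range(num_articles), key=lambda i: scores[i])

def lawrec_pipeline(case_text, law_text):
    keywords = extract_law_keywords(law_text)
    fused = fuse(encode(case_text), encode(" ".join(keywords)))
    return recommend(fused)
```

The point of the sketch is the data flow (extraction, encoding, fusion, classification), which mirrors Figure 1; each stage would be replaced by the corresponding learned component.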

3.1. Feature Extraction
The legal provisions for specific types of cases are generally long. To accurately locate the core knowledge of the legal provisions, this paper extracts the keywords of each provision and thus obtains its core knowledge, which facilitates subsequent processing.
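The paper does not specify its keyword extraction algorithm; a common choice for this kind of task is TF-IDF ranking, sketched here with the standard library only (the corpus and scoring details are illustrative assumptions):

```python
import math
from collections import Counter

def tfidf_keywords(doc, corpus, k=3):
    """Rank the terms of `doc` by TF-IDF against a small corpus of law
    articles and return the top-k terms as the article's 'core keywords'."""
    docs_tokens = [d.lower().split() for d in corpus]
    doc_tokens = doc.lower().split()
    tf = Counter(doc_tokens)
    n_docs = len(corpus)

    def idf(term):
        # Smoothed inverse document frequency.
        df = sum(1 for toks in docs_tokens if term in toks)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    scored = {t: (tf[t] / len(doc_tokens)) * idf(t) for t in tf}
    return [t for t, _ in sorted(scored.items(), key=lambda kv: -kv[1])[:k]]
```

Terms that appear in many provisions (function words) get low IDF and drop out, while provision-specific vocabulary survives, which matches the goal of locating core knowledge.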
3.2. BERT Model for Feature Modelling
For the text description $c = \{c_1, c_2, \ldots, c_n\}$ and legal knowledge $k = \{k_1, k_2, \ldots, k_m\}$ of a specific case, where $n$ represents the length of the case description text and $m$ is the length of the legal text knowledge, this paper uses the BERT model to characterize them, respectively. Based on the BERT model, we get the text description vectors as follows:
$$v_c = \mathrm{BERT}(c), \qquad v_k = \mathrm{BERT}(k),$$
where $v_c$ and $v_k$ represent the BERT-based case text description vector and the legal rule knowledge representation vector, respectively. To improve the continuous representation ability of text sequence information, a Skip-RNN layer is added after the BERT pretraining module. For longer sequences, the Skip-RNN adds a skip gate, which outputs the number of steps to skip according to the current state, thereby speeding up training. The Skip-RNN can learn forward and backward information, improving the extraction of contextual feature information from the text feature vectors and alleviating long-distance dependencies. Specifically,
$$\overrightarrow{h}_c = \overrightarrow{\mathrm{SkipRNN}}(v_c), \qquad \overleftarrow{h}_c = \overleftarrow{\mathrm{SkipRNN}}(v_c),$$
$$\overrightarrow{h}_k = \overrightarrow{\mathrm{SkipRNN}}(v_k), \qquad \overleftarrow{h}_k = \overleftarrow{\mathrm{SkipRNN}}(v_k).$$
Among them, $\overrightarrow{h}_c$ and $\overleftarrow{h}_c$ are the forward and backward outputs of the case description text hidden layer, respectively, and $\overrightarrow{h}_k$ and $\overleftarrow{h}_k$ are the forward and backward outputs of the legal knowledge hidden layer, respectively.
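The skip-gate idea can be illustrated with a one-dimensional toy recurrence: a gate inspects the current state and decides how many upcoming steps to copy through without computation. The update rule and gate threshold below are toy choices for illustration, not the published Skip-RNN parameterization:

```python
# Toy Skip-RNN: the state h is updated only on non-skipped steps; when the
# gate fires, the next `max_skip` inputs are passed over with h unchanged.

def skip_rnn(inputs, max_skip=2):
    h = 0.0
    updates = 0   # count of real (non-skipped) state updates
    skip = 0      # steps remaining to skip
    for x in inputs:
        if skip > 0:
            skip -= 1            # copy state through, no computation
            continue
        h = 0.5 * h + 0.5 * x    # toy recurrent update
        updates += 1
        if abs(h) > 1.0:         # toy skip gate on the current state
            skip = max_skip
    return h, updates
```

On a sequence of large inputs the gate fires and most steps are skipped; on small inputs every step is processed, which is the speed/coverage trade-off the text describes.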
3.3. Legal Knowledge Representation
To enhance the importance of legal article knowledge, an attention mechanism is used to fuse the legal article knowledge with the case description, yielding a case description representation that integrates legal knowledge. The attention score between the case description features and the legal knowledge feature is computed as follows:
$$e_i = h_k^{\top} W h_{c_i},$$
where $h_{c_i}$ represents the feature of the $i$-th token of the case description text and $W$ is a learned weight matrix. Then, the attention of the legal article knowledge features over each case text feature is normalized, and the specific formula is as follows:
$$\alpha_i = \frac{\exp(e_i)}{\sum_{j} \exp(e_j)},$$
where $\alpha_i$ represents the attention weight between the external legal article knowledge features and the $i$-th feature of the case facts.
Finally, the case description text features are weighted and summed according to the attention weights to obtain the case description vector fused with legal knowledge. The attention mechanism can focus on useful information and ignore unimportant information; its principle is to compute a weight for each piece of information, and the greater the weight, the more important the information.
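The fusion step above (score, softmax-normalize, weighted sum) can be sketched in plain Python; a simple dot product stands in for the learned bilinear score $h_k^{\top} W h_{c_i}$:

```python
import math

def attention_fuse(case_feats, law_vec):
    """Weight each case-token feature by its similarity to the legal-knowledge
    vector, then return the weighted sum and the attention weights."""
    # Scores: dot product between each token feature and the law vector.
    scores = [sum(c * k for c, k in zip(feat, law_vec)) for feat in case_feats]
    # Numerically stable softmax normalization.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of token features -> fused case description vector.
    dim = len(case_feats[0])
    fused = [sum(w * feat[d] for w, feat in zip(weights, case_feats))
             for d in range(dim)]
    return fused, weights
```

Tokens aligned with the legal knowledge receive larger weights and dominate the fused vector, which is exactly the "focus on useful information" behavior described above.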
3.4. Law Recommendation
Like previous legal recommendation methods, the prediction process is divided into three steps: (1) fuse the case description features and the legal knowledge into a vector $s$, (2) perform a linear transformation, and (3) use softmax to obtain the prediction. The prediction result $\hat{y}$ is:
$$\hat{y} = \mathrm{softmax}(W_o s + b_o).$$
This paper uses a cross-entropy loss to minimize the prediction error between the output result and the label. The cross-entropy loss formula is as follows:
$$L = -\sum_{i} y_i \log \hat{y}_i + \lambda \lVert \theta \rVert^2,$$
where $\hat{y}$ is the label vector predicted by the model, $y$ is the ground-truth label vector, and $\lambda \lVert \theta \rVert^2$ is the regularization term.
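The prediction and loss steps can be checked with a minimal sketch: softmax over logits, then cross-entropy against a one-hot gold label with an optional L2 penalty (the function names and the parameter list passed for regularization are illustrative):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

def cross_entropy(probs, gold, params=None, lam=0.0):
    """Cross-entropy of predicted probabilities against a one-hot gold label
    index, plus an optional L2 regularization term lam * ||params||^2."""
    loss = -math.log(probs[gold])
    if params:
        loss += lam * sum(p * p for p in params)
    return loss
```

The loss shrinks as the probability assigned to the correct article grows, and the regularization term penalizes large parameters, matching the formula above.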
4. Experimental Results and Analysis
4.1. Experimental Dataset
Experiments on the legal article recommendation task were conducted on the Fayan Cup dataset [24]. To achieve a relatively balanced dataset, this paper removed some low-frequency legal articles from the Fayan Cup dataset as well as some invalid samples and stop words. Finally, the training set contains 800,000 samples, and the validation set and the test set each contain 50,000 samples. The number of law labels selected in this paper is about 1.1 million. The scale of the dataset is shown in Table 1.
Since this paper uses the fact description and the law parts of the data, the law recommendation data include the case fact description text and the specific law label. The specific form of the data is shown in Table 2.
Like the traditional law recommendation task, this paper uses the $F_1$ value as the evaluation index:
$$F_1 = \frac{2PR}{P + R},$$
where $P$ is the precision rate and $R$ is the recall rate.
4.2. Model Construction and Experimental Parameter Settings
The model used in this paper is built with PyTorch, and the specific parameters are designed as follows: the word vector dimension is 320, the number of Skip-RNN hidden layer units is 440, the learning rate is 0.002, the dropout is set to 0.6 to prevent overfitting, and the batch size is 64.
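For reference, the reported hyperparameters can be collected into a single configuration dictionary (a convenience sketch; the original code layout is not published, and the key names are our own):

```python
# Hyperparameters as reported in the paper, gathered into one config dict.
LAWREC_CONFIG = {
    "word_vector_dim": 320,       # word vector dimension
    "skip_rnn_hidden_units": 440, # Skip-RNN hidden layer units
    "learning_rate": 0.002,
    "dropout": 0.6,               # dropout rate, to prevent overfitting
    "batch_size": 64,
}
```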
4.3. Comparative Model and Analysis of Experimental Results
To prove the effectiveness of the method proposed in this paper, we compare and analyze three aspects: traditional law recommendation methods, law knowledge ablation, and BERT pretraining model ablation. The baselines are as follows:
(a) Transformer [25]: it has achieved very good results in the field of machine translation.
(b) SVM [26]: it was first used to solve two-class problems in pattern recognition and has achieved good classification results in text classification, handwriting recognition, and image processing.
(c) TextRNN [27]: a model that uses RNNs for text classification.
(d) FastText [28]: its biggest feature is that the model is simple and training is very fast; it is widely used in text classification.
(e) BERT [29]: it has strong text representation ability and achieves good results on various deep learning tasks.
(f) TextCNN [30]: a typical model that uses CNNs for text classification.
The specific experimental results are shown in Table 3.
Experimental results show that LawRec, based on BERT, significantly outperforms the traditional data-driven methods in terms of the precision $P$, recall $R$, and $F_1$ values.
In order to verify the impact of the BERT pretraining model on performance, this paper compares the BERT pretraining model with a word2vec representation model on the legal recommendation task, using the Fayan Cup public test set. The specific experimental results are shown in Table 4.
It can be seen from Table 4 that the model based on BERT representations significantly improves the results of legal recommendation. This is because the BERT pretraining model has a strong representation ability for both legal knowledge and case description text, which improves the performance of legal recommendation.
In order to verify the impact of incorporating legal knowledge on the performance of legal recommendation, this paper conducts a comparative experiment of incorporating legal knowledge ablation, and the specific experimental results are shown in Table 5.
It can be seen from Table 5 that adding legal rule knowledge greatly improves the $F_1$ value on the test set. This is because fusing legal article knowledge improves the feature extraction of the case text description to a certain extent, making the extracted case text features more attuned to legal article knowledge and thereby improving recommendation performance. The experimental results demonstrate the effectiveness of incorporating external knowledge of legal articles for the legal article recommendation task.
To further illustrate that the law recommendation model incorporating legal knowledge can effectively deal with easily confused laws, a specific analysis is given based on Article 234 of the Criminal Law (crime of intentional injury) and Article 232 of the Criminal Law (crime of intentional homicide). Table 6 shows a case of intentional injury that was mispredicted as intentional homicide by the model without legal knowledge. From the description of the case, we can see that there are many keywords that distinguish the crime of intentional injury from the crime of intentional homicide, such as "intentional injury," "body," "negligently causing death," "cruel means," "serious injury," and "serious disability." Therefore, we conclude that the legal recommendation model incorporating legal knowledge can accurately distinguish the crime of intentional injury from the crime of intentional homicide. This example shows, to a certain extent, that the model incorporating legal knowledge can effectively solve the problem of confusing legal recommendations.
To illustrate the process of legal recommendation more intuitively, an example analysis of a fraud crime is given, as shown in Tables 6 and 7.
Based on keywords (e.g., "public and private property" and "large amount of money") combined with the attention mechanism, the attention paid to fraud crimes can be increased. This paper combines legal knowledge with the case description text to achieve targeted feature extraction, thereby achieving accurate legal recommendation.
5. Conclusion
Traditional information retrieval systems based on keyword search for laws and regulations are not suitable for ordinary people without professional legal knowledge; therefore, a legal recommendation framework is needed to bridge the gap between complex case descriptions and professional legal vocabulary. This paper proposed a law recommendation framework, called LawRec, based on BERT and Skip-RNN models, which integrates the knowledge of legal provisions with the case description and uses the BERT model to learn the case description text and the legal knowledge, respectively. Finally, laws and regulations for cases can be recommended. Experimental results show that the proposed LawRec achieves better performance than state-of-the-art methods: its accuracy is 92%, which is 12% higher than that of the model without legal knowledge.
Data Availability
The labeled datasets used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest.