Research Article

A Breast Cancer Prediction Model Based on a Panel from Circulating Exosomal miRNAs

Figure 2

Machine learning-based diagnostic using 16 exo-miRNAs. (a) Lasso logistic regression was implemented with the R package glmnet (parameter family set to binomial) to select strongly correlated features. Under 5-fold cross-validation, the best performance was obtained at the peak of the AUC curve, where the penalty coefficient is lambda.min (0.057); lambda.1se (0.109) gives a simpler model without significantly reducing performance. (b) Penalty plot of the coefficients of 1962 miRNAs. As the penalty coefficient lambda increases, the coefficients of more and more variables are shrunk to 0; 16 miRNAs are selected when lambda is 0.057. (c,d) Distribution of the training set (c) and test set (d) samples across the different categories, together with the accuracy of Lasso regression. (e,f) ROC curves of the training set (e) and the test set (f) for the different models; the area under each curve is the model's AUC value.
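The two ideas in panels (a) and (b) can be illustrated with a minimal, self-contained sketch. It is not the authors' pipeline and does not call glmnet: it swaps in a plain proximal-gradient solver for L1-penalized logistic regression, and the function names, the toy data, and the cross-validation numbers below are all hypothetical and chosen only for illustration. The first part shows how a larger lambda shrinks coefficients to exactly 0; the second applies the usual rule for picking lambda.min (best mean AUC) and lambda.1se (largest lambda whose mean AUC is within one standard error of the best).

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward 0; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Minimise mean log-loss + lam * sum(|w|) by proximal gradient descent.

    The intercept b is not penalised. This is an illustrative solver,
    not the coordinate-descent algorithm used inside glmnet.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        res = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
               for row, yi in zip(X, y)]
        grad_b = sum(res) / n
        grad_w = [sum(r * row[j] for r, row in zip(res, X)) / n
                  for j in range(p)]
        b -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal map).
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return b, w


def select_lambdas(lambdas, mean_auc, se_auc):
    """Return (lambda_min, lambda_1se) from cross-validated AUC summaries."""
    best = max(range(len(lambdas)), key=lambda i: mean_auc[i])
    cutoff = mean_auc[best] - se_auc[best]
    lambda_1se = max(l for l, a in zip(lambdas, mean_auc) if a >= cutoff)
    return lambdas[best], lambda_1se


# Hypothetical toy data: feature 0 is informative, feature 1 is noise.
X = [[-2.0, 0.5], [-1.0, -0.5], [-0.5, 0.5], [0.5, -0.5],
     [1.0, 0.5], [2.0, -0.5], [-0.3, 0.5], [0.3, -0.5]]
y = [0, 0, 0, 1, 1, 1, 1, 0]

_, w_weak_penalty = fit_l1_logistic(X, y, lam=0.05)
_, w_strong_penalty = fit_l1_logistic(X, y, lam=1.0)
print("lam=0.05:", w_weak_penalty)
print("lam=1.0: ", w_strong_penalty)

# Hypothetical cross-validation summaries (not values from the article).
cv_lambdas = [0.2, 0.1, 0.05, 0.025]
cv_mean_auc = [0.80, 0.93, 0.95, 0.94]
cv_se_auc = [0.03, 0.03, 0.03, 0.03]
lambda_min, lambda_1se = select_lambdas(cv_lambdas, cv_mean_auc, cv_se_auc)
print("lambda.min:", lambda_min, "lambda.1se:", lambda_1se)
```

In this sketch the strong penalty removes every feature while the weak penalty keeps the informative one, which mirrors the behaviour the penalty plot describes; the one-standard-error rule then trades a small loss in AUC for a larger lambda and hence a sparser model.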