A New Computer-Aided Diagnosis System with Modified Genetic Feature Selection for BI-RADS Classification of Breast Masses in Mammograms
Table 4
Summary of the 46 features selected as the best feature subset by our modified genetic feature selection method.
BI-RADS category
Feature name
Feature significance
Shape
Irregularity; Difference area (convex hull area minus mass area); Variance variation; Kurtosis variation; Entropy variation
(i) Irregularity: this feature is extracted with the active contour method (also known as snakes) [39]. Each point at which the edge path of a mammographic mass changes direction is marked, and the irregularity equals the number of marked points along the margin. (ii) The remaining features (difference area: convex hull area minus mass area, variance variation, kurtosis variation, and entropy variation) were proposed for sonography in [40]. The convex hull is the smallest convex polygon that contains the mass contour and the mass region. The convex hull area is the number of pixels in the convex hull of the mass, and the mass area is the number of pixels in the mass region. The other features are computed from a function called “variation,” which is the projection of the distance between the farthest pixels of the mass region at all angles; variance, kurtosis, and entropy are statistical values of this function.
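The shape features in this row can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the implementation of [40]: the mass is a hypothetical binary mask (a list of rows of 0/1), the hull is built from pixel centers, “variation” is taken as the extent (maximum minus minimum) of the pixel projections at each angle, and entropy is the Shannon entropy of the variation values normalized to sum to one. The function names are illustrative.

```python
import math
from statistics import fmean

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p is inside or on the boundary of a counter-clockwise convex hull."""
    n = len(hull)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        if (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) < 0:
            return False
    return True

def difference_area(mask):
    """Convex hull pixel count minus mass pixel count (mask must be non-empty)."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    hull = convex_hull(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    hull_area = sum(
        inside_hull(hull, (x, y))
        for x in range(min(xs), max(xs) + 1)
        for y in range(min(ys), max(ys) + 1)
    )
    return hull_area - len(pixels)

def variation(pixels, n_angles=180):
    """Assumed form of 'variation': projected extent of the mass at each angle."""
    values = []
    for k in range(n_angles):
        t = math.pi * k / n_angles
        proj = [x * math.cos(t) + y * math.sin(t) for x, y in pixels]
        values.append(max(proj) - min(proj))
    return values

def variation_stats(values):
    """Variance, (non-excess) kurtosis, and Shannon entropy of the variation values."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m4 = fmean([(v - mu) ** 4 for v in values])
    total = sum(values)
    probs = [v / total for v in values if v > 0] if total > 0 else []
    return {
        "variance": m2,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "entropy": -sum(p * math.log(p) for p in probs),
    }

# Toy L-shaped mass: 7 mass pixels, triangular hull covering 10 pixels.
mask = [[1, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
print(difference_area(mask))  # 3
print(variation_stats(variation(pixels)))
```

A convex mass gives a difference area of zero, so larger values indicate deeper concavities in the contour.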
Margin
Kurtosis “variance”; Kurtosis “kurtosis”; Entropy “minimum”; Entropy “average”; Entropy “variance”; Entropy “kurtosis”; Index of the maximum probability “minimum”; Index of the maximum probability “maximum”; Index of the maximum probability “variance”
(i) These features are quantified from a set of waveforms by the wavelet analysis used in [41]. (ii) Margin kurtosis measures how peaked a probability distribution is; it takes higher values for well-defined margins, which in general show an abrupt transition along a waveform. (iii) Margin entropy is defined as a state of disorder, or decline into disorder, of the edge waveforms; well-defined margins are expected to have lower entropy. (iv) Index of the maximum probability is shifted to be zero on the margin, positive outside the margin, and negative inside it. This index captures variations in the most probable edge location along the margin. (v) A waveform of a given length is placed sequentially while traversing the margin, and an edge probability vector (EP) is computed for each waveform. The features “kurtosis,” “entropy,” and “the index of the maximum probability” are then calculated from the EP of each waveform. Because we set the number of waveforms to 32 in our experiments, statistical functions such as variance, kurtosis, minimum, maximum, and average are used to reduce the number of margin features.
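The reduction described in (v) can be illustrated as follows. The wavelet step that produces each EP in [41] is not reproduced here; the sketch assumes the EP vectors are already available as lists of non-negative numbers ordered from inside to outside the mass, with the margin at the center sample. Kurtosis is computed over the EP values themselves and entropy as Shannon entropy, which are assumptions about the exact definitions, and all function names are illustrative.

```python
import math
from statistics import fmean

def kurtosis(values):
    """Non-excess kurtosis m4 / m2^2; returns 0.0 for a constant sequence."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    if m2 == 0:
        return 0.0
    return fmean([(v - mu) ** 4 for v in values]) / m2 ** 2

def shannon_entropy(ep):
    """Shannon entropy of an EP vector after normalizing it to sum to one."""
    total = sum(ep)
    if total <= 0:
        return 0.0
    return -sum((v / total) * math.log(v / total) for v in ep if v > 0)

def max_prob_index(ep):
    """Position of the largest EP entry, shifted so the center sample is 0."""
    return ep.index(max(ep)) - len(ep) // 2

def variance(values):
    mu = fmean(values)
    return fmean([(v - mu) ** 2 for v in values])

PER_WAVEFORM = {"kurtosis": kurtosis,
                "entropy": shannon_entropy,
                "index of the maximum probability": max_prob_index}
SUMMARIES = {"minimum": min, "maximum": max, "average": fmean,
             "variance": variance, "kurtosis": kurtosis}

def margin_features(eps):
    """Summarize each per-waveform measure across all waveforms."""
    features = {}
    for name, measure in PER_WAVEFORM.items():
        per_waveform = [measure(ep) for ep in eps]
        for stat, summarize in SUMMARIES.items():
            features[(name, stat)] = summarize(per_waveform)
    return features

# Three toy waveforms (the experiments in the text use 32).
sharp = [0.0, 0.0, 1.0, 0.0, 0.0]    # abrupt transition at the margin
blurry = [0.2, 0.2, 0.2, 0.2, 0.2]   # no preferred edge location
shifted = [0.1, 0.1, 0.1, 0.1, 0.6]  # most probable edge lies outside
features = margin_features([sharp, blurry, shifted])
print(features[("entropy", "minimum")])
print(features[("index of the maximum probability", "maximum")])
```

With these definitions the sharp waveform has higher kurtosis and lower entropy than the blurry one, matching the behavior described in (ii) and (iii). Table 4 retains only 9 of the 15 measure and summary combinations computed here.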
Density
Angular second moment “skewness”; Contrast “minimum”; Contrast “average”; Contrast “variance”; Contrast “standard deviation”; Contrast “skewness”; Correlation “variance”; Correlation “standard deviation”; Correlation “skewness”; Variance (sum of squares) “average”; Variance (sum of squares) “variance”; Variance (sum of squares) “skewness”; Inverse difference moment “variance”; Sum average “maximum”; Sum average “standard deviation”; Sum variance “maximum”; Sum entropy “maximum”; Sum entropy “kurtosis”; Entropy “minimum”; Difference variance “variance”; Difference entropy “maximum”; Difference entropy “average”; Difference entropy “variance”; Information measure of correlation 1 “minimum”; Information measure of correlation 1 “standard deviation”; Information measure of correlation 1 “skewness”; Information measure of correlation 2 “minimum”; Information measure of correlation 2 “maximum”; Maximal correlation coefficient “standard deviation”; Maximal correlation coefficient “kurtosis”
(i) Angular second moment “skewness”: a measure of the asymmetry of the distribution of the angular second moment features about its mean.
(ii) Contrast “minimum”: the smallest observation of the contrast features.
(iii) Contrast “average”: the statistical value that describes the center of the set of contrast features.
(iv) Contrast “variance”: measures how far the contrast features are spread out from their average value.
(v) Contrast “standard deviation”: a measure of the amount of variation or dispersion of the contrast features.
(vi) Contrast “skewness”: a measure of the asymmetry of the distribution of the contrast features about its mean.
(vii) Correlation “variance”: measures how far the correlation features are spread out from their average value.
(viii) Correlation “standard deviation”: a measure of the amount of variation or dispersion of the correlation features.
(ix) Correlation “skewness”: a measure of the asymmetry of the distribution of the correlation features about its mean.
(x) Variance (sum of squares) “average”: the statistical value that describes the center of the set of sum of squares features.
(xi) Variance (sum of squares) “variance”: measures how far the sum of squares features are spread out from their average value.
(xii) Variance (sum of squares) “skewness”: a measure of the asymmetry of the distribution of the sum of squares features about its mean.
(xiii) Inverse difference moment “variance”: measures how far the inverse difference moment features are spread out from their average value.
(xiv) Sum average “maximum”: the greatest observation of the sum average features.
(xv) Sum average “standard deviation”: a measure of the amount of variation or dispersion of the sum average features.
(xvi) Sum variance “maximum”: the greatest observation of the sum variance features.
(xvii) Sum entropy “maximum”: the greatest observation of the sum entropy features.
(xviii) Sum entropy “kurtosis”: a measure of whether the sum entropy features are heavy-tailed or light-tailed relative to a normal distribution.
(xix) Entropy “minimum”: the smallest observation of the entropy features.
(xx) Difference variance “variance”: measures how far the difference variance features are spread out from their average value.
(xxi) Difference entropy “maximum”: the greatest observation of the difference entropy features.
(xxii) Difference entropy “average”: the statistical value that describes the center of the set of difference entropy features.
(xxiii) Difference entropy “variance”: measures how far the difference entropy features are spread out from their average value.
(xxiv) Information measure of correlation 1 “minimum”: the smallest observation of the information measure of correlation 1 features.
(xxv) Information measure of correlation 1 “standard deviation”: a measure of the amount of variation or dispersion of the information measure of correlation 1 features.
(xxvi) Information measure of correlation 1 “skewness”: a measure of the asymmetry of the distribution of the information measure of correlation 1 features about its mean.
(xxvii) Information measure of correlation 2 “minimum”: the smallest observation of the information measure of correlation 2 features.
(xxviii) Information measure of correlation 2 “maximum”: the greatest observation of the information measure of correlation 2 features.
(xxix) Maximal correlation coefficient “standard deviation”: a measure of the amount of variation or dispersion of the maximal correlation coefficient features.
(xxx) Maximal correlation coefficient “kurtosis”: a measure of the combined weight of the tails of the maximal correlation coefficient distribution relative to the center of the distribution.
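The density feature names match Haralick's gray-level co-occurrence matrix (GLCM) texture measures, each reduced by a summary statistic. The table does not state which set of values each statistic is taken over; the sketch below assumes one value per pixel offset (four directions at distance 1) and shows only the contrast measure. The quantized toy image, the offsets, and the function names are all illustrative assumptions.

```python
import math
from statistics import fmean

def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(map(sum, counts)) or 1.0  # avoid dividing by zero
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p(i, j)."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

def summarize(values):
    """Summary statistics used to reduce a set of texture values."""
    mu = fmean(values)
    var = fmean([(v - mu) ** 2 for v in values])
    std = math.sqrt(var)
    skew = fmean([(v - mu) ** 3 for v in values]) / std ** 3 if std > 0 else 0.0
    return {"minimum": min(values), "maximum": max(values), "average": mu,
            "variance": var, "standard deviation": std, "skewness": skew}

# Toy image with vertical stripes, already quantized to 2 gray levels.
image = [[0, 1, 0, 1],
         [0, 1, 0, 1],
         [0, 1, 0, 1],
         [0, 1, 0, 1]]
offsets = [(1, 0), (1, 1), (0, 1), (-1, 1)]  # 0, 45, 90, 135 degrees (assumed)
contrasts = [contrast(glcm(image, dx, dy, levels=2)) for dx, dy in offsets]
print(contrasts)             # [1.0, 1.0, 0.0, 1.0]
print(summarize(contrasts))
```

The stripes give zero contrast along the vertical offset and unit contrast elsewhere, so the summary statistics capture how strongly the texture depends on direction.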
Additional features
Mass size Patient age
(i) The mass size is the area of the zone enclosed by the contour, computed in mm². (ii) The patient’s age is also used; it is the only patient-related (non-image) feature used here.
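As a minimal sketch of the mass size computation, the area in mm² can be obtained by counting the pixels inside the contour and scaling by the pixel area. The binary mask representation and the `pixel_spacing_mm` parameter (assumed square pixels) are hypothetical; the table does not state how the conversion is done.

```python
def mass_size_mm2(mask, pixel_spacing_mm):
    """Area of the segmented mass in mm^2, assuming square pixels."""
    n_pixels = sum(1 for row in mask for v in row if v)
    return n_pixels * pixel_spacing_mm ** 2

# Toy mask with 7 mass pixels at an assumed spacing of 0.1 mm per pixel.
mask = [[1, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
print(mass_size_mm2(mask, pixel_spacing_mm=0.1))
```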