Abstract

Images of Chinese characters often suffer from insufficient illumination, underlines beneath the characters, or low resolution. For such images, a joint method of interference suppression and super-resolution is proposed. In the interference suppression preprocessing stage, image layer separation first decomposes the Chinese character image into an illumination layer and a reflectance layer, and the reflectance layer, which carries the essential properties of the input image, is retained. The reflectance layer is then decomposed into four coefficient subimages by the wavelet transform, and image smoothing via L0 gradient minimization is applied to each subimage with a different scale factor. Simple background processing and image filtering then complete the preprocessing. In the super-resolution stage, because large numbers of high-resolution Chinese character images are difficult to acquire, we adopt the neighbor embedding super-resolution method, which needs only a small training set. In the key feature vector of a Chinese character image, the weights of the horizontal and vertical stroke features are strengthened, while the other stroke features are also taken into account, yielding a super-resolution method better suited to Chinese character images. Experimental results show that our interference suppression method is effective for Chinese character images with the aforementioned interference, and that our optimized super-resolution method performs better on the test images than bicubic interpolation and three classical super-resolution methods.

1. Introduction

In the information era, electronic text has become one of the most important high-speed carriers through which people transmit and receive information. Chinese characters, with their profound cultural connotations, form one of the earliest writing systems in the world and have been handed down for thousands of years. As a special type of image, the Chinese character image has broad applications in image and information processing. However, Chinese character images are often captured under nonideal conditions such as insufficient illumination, which introduces interference. Moreover, underlines are often added for neatness when Chinese characters are written, which also interferes with later automatic processes such as image analysis and image understanding.

In practical applications, a clear, high-resolution Chinese character image is necessary for a human or an intelligent machine to recognize and understand the information it conveys. Owing to the technical limits of imaging and scanning devices, the resolution of Chinese character images is usually limited. Generally speaking, the missing high-frequency information in a low-resolution image cannot be effectively recovered by simple methods: although they can increase the resolution of the image, its visual clarity remains far from that of the original. It is therefore highly desirable to improve the resolution of Chinese character images while preserving the visual clarity of the result as far as possible [1].

In the field of image super-resolution, the super-resolution methods can be broadly divided into three types: interpolation-based method, reconstruction-based method, and learning-based method [17].

In interpolation-based methods, the luminance of each pixel to be interpolated is calculated from the luminance of its neighboring pixels. Such methods are easy to implement, but their ability to estimate lost high-frequency information is limited. In reconstruction-based methods, a high-resolution image is generated from multiple low-resolution images. These methods require strong prior information as an additional constraint; otherwise good detail reconstruction is difficult to achieve [8].

In recent years, learning-based methods have become mainstream because of their unique advantages. They predict and recover the missing high-frequency information in a single low-resolution image either by constructing a training set and learning the relationship between high- and low-resolution image patches with machine learning algorithms, or by using the information in the training set directly. As a result, they can improve the image resolution while keeping the visual clarity as close as possible to that of the original image.

One representative learning-based method is the sparse representation super-resolution method proposed by Jianchao Yang et al. [1, 9]. Based on compressed sensing theory, it trains two dictionaries on the training set, one for high-resolution and one for low-resolution image patches. Assuming that corresponding high- and low-resolution patches share the same sparse coefficients, a high-resolution patch is obtained from the high-resolution dictionary and the sparse coefficients of the corresponding low-resolution patch. Compared with previous methods, the learned dictionary pair is a more compact representation of image patches, which greatly reduces the overall computational complexity. The method can be used for super-resolution of natural or human face images. Nevertheless, its basic assumption, i.e., that the sparse representation coefficients of corresponding high- and low-resolution patches are the same, is not always satisfied.

Another representative learning-based method is the neighbor embedding super-resolution method proposed by Hong Chang et al. [6], which borrows the idea of locally linear embedding from manifold learning and assumes that the manifolds formed by high- and low-resolution image patches have similar local structure. Under this assumption and the constraint of minimizing the reconstruction error, the neighbors of a test low-resolution patch are selected from the training set and the corresponding linear combination coefficients are computed. Note that these calculations are based on the typical features of image patches. The test high-resolution patch is then obtained by applying these coefficients to the high-resolution neighbors that correspond to the selected low-resolution neighbors.
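The neighbor selection and weight computation described above can be sketched as follows. This is a minimal illustration rather than the code of [6]: the function names, the regularization constant `reg`, and the use of squared Euclidean distance are assumptions made for the example.

```python
import numpy as np

def neighbor_embedding_weights(x, neighbors, reg=1e-6):
    """Weights w (summing to 1) that minimize ||x - sum_j w_j * neighbors[j]||^2.

    x:         (d,) feature vector of the test low-resolution patch
    neighbors: (k, d) feature vectors of the selected training patches
    reg:       hypothetical regularization constant that keeps the system solvable
    """
    diff = neighbors - x                       # offsets of the neighbors from x
    gram = diff @ diff.T                       # local Gram matrix, shape (k, k)
    k = len(neighbors)
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(k)
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()                         # enforce the sum-to-one constraint

def reconstruct_hr_patch(x, lr_feats, hr_patches, k=5):
    """Estimate a high-resolution patch from the k nearest low-resolution neighbors."""
    dists = np.sum((lr_feats - x) ** 2, axis=1)
    idx = np.argsort(dists)[:k]                # indices of the k nearest neighbors
    w = neighbor_embedding_weights(x, lr_feats[idx])
    return w @ hr_patches[idx]                 # same weights applied to the HR neighbors
```

Here `lr_feats` and `hr_patches` stand for paired training data, one row per patch; the same weights computed in the low-resolution feature space are reused on the high-resolution side, which is the local-structure assumption stated in the text.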

Apart from the aforementioned classical methods, recent studies on image super-resolution roughly follow two trends.

One trend is deep learning methods driven by big data. For example, Chao Dong et al. propose a deep learning method for single-image super-resolution. Their SRCNN (Super-Resolution Convolutional Neural Network) learns an end-to-end mapping between low- and high-resolution images, with little extra pre-/postprocessing beyond the optimization [10]. Moreover, a DRRN (Deep Recursive Residual Network) for single-image super-resolution is proposed by Ying Tai et al. In DRRN, an enhanced residual unit structure is recursively learned in a recursive block, and several recursive blocks are stacked to learn the residual image between high- and low-resolution images. Finally, the residual image is added to the input low-resolution image through a global identity branch to estimate the corresponding high-resolution image [11]. In addition, Wenzhe Shi et al. report a subpixel convolution layer that can super-resolve low-resolution data into high-resolution space at very little additional computational cost [12].

The other trend is the conventional method driven by algorithm. For example, in order to improve the reconstruction performance of the existing super-resolution methods, De-tian Huang et al. propose a new super-resolution method based on sparse representation using regularization technique and guided filter [8]. Besides, a novel image super-resolution method called LANR-NLM (Locally Regularized Anchored Neighborhood Regression with Nonlocal Means) is proposed by Junjun Jiang et al. It uses locality constraint to select similar dictionary atoms and assigns different freedom to each dictionary atom based on its correlation to the input low-resolution patch [13]. Additionally, a self-similarity based image super-resolution method using transformed self-exemplars is proposed by Jia-Bin Huang et al., in which a factored patch transformation representation is employed to account for both planar perspective distortion and affine shape deformation of image patches simultaneously. They also exploit the 3D scene geometry and patch search space expansion for strengthening the self-exemplar search [14].

In practice, it is difficult to obtain a large number of high-resolution Chinese character images, so we have not adopted the big-data-based methods above. Chinese characters are highly similar to one another, and exploiting this similarity requires a training set, so we have not adopted the aforementioned training-free method either. Instead, we adopt the neighbor embedding super-resolution method, which relies on a small-scale training set, and optimize it to make it more suitable for Chinese character images.

In this paper, we integrate and optimize several existing theories and methods to achieve interference suppression and super-resolution for Chinese character images with interference. Specifically, the following are mainly used: image layer separation based on relative smoothness, image smoothing via L0 gradient minimization, and the neighbor embedding super-resolution method. Note that all Chinese character images in this paper have black characters on a white background.

Image layer separation based on relative smoothness was proposed by Yu Li et al. [15]; it extracts two layers from a single image, one smoother than the other. Assume that the input image $I$ is the product of a reflectance layer $R$ and an illumination layer $L$ at each pixel. Therefore, $I$ can be expressed by

$$I = R \cdot L. \tag{1}$$
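To make the multiplicative model concrete, the sketch below separates an image into a smooth layer and a residual layer in the log domain, where the product of reflectance and illumination becomes a sum. It is not the relative-smoothness method of [15]: a plain box blur is swapped in as a crude stand-in for the smooth-layer estimator, and the names `box_blur`, `split_layers`, `radius`, and `eps` are assumptions made for the example.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with edge replication (stand-in for a smooth-layer estimator)."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(img, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size ** 2

def split_layers(image, radius=8, eps=1e-6):
    """Return (reflectance, illumination) whose pixelwise product is image + eps."""
    log_i = np.log(image + eps)        # product model becomes a sum of logs
    log_l = box_blur(log_i, radius)    # smooth part, taken as the illumination layer
    log_r = log_i - log_l              # remainder, taken as the reflectance layer
    return np.exp(log_r), np.exp(log_l)
```

Whatever estimator is used for the smooth layer, the two outputs multiply back to the input by construction; the quality of the separation depends entirely on how well the smooth layer matches the true illumination.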

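Image smoothing via L0 gradient minimization was proposed by Li Xu et al. [17]. Assume that the input image is expressed by $I$ and the smoothed image by $S$. The smoothed image has a gradient $\nabla S_p = (\partial_x S_p, \partial_y S_p)^T$ at any pixel $p$, where $\partial_x$, $\partial_y$ are the operators that calculate the gradient of pixel $p$ in the horizontal and vertical directions, respectively. $\#$ is the counting operator, outputting the number of pixels $p$ that satisfy $|\partial_x S_p| + |\partial_y S_p| \neq 0$, and the gradient counter can then be expressed by

$$C(S) = \#\{\, p \mid |\partial_x S_p| + |\partial_y S_p| \neq 0 \,\}. \tag{2}$$

The smoothed image can be further estimated by the following minimization function:

$$\min_S \Big\{ \sum_p (S_p - I_p)^2 + \lambda \, C(S) \Big\}, \tag{3}$$

where $\sum_p (S_p - I_p)^2$ is the data fidelity term that ensures the similarity of the input image and the smoothed image at every pixel, and $\lambda$ is the filter parameter.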
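As an illustration, the gradient counter and the objective value of the minimization can be evaluated for a candidate smoothed image as follows. This sketch only evaluates the objective; it does not implement the solver of [17]. The forward-difference discretization with a replicated last row and column, and the names `gradient_count` and `l0_objective`, are assumptions made for the example.

```python
import numpy as np

def gradient_count(s):
    """Number of pixels whose horizontal or vertical forward difference is non-zero."""
    dx = np.diff(s, axis=1, append=s[:, -1:])   # horizontal differences, zero at the last column
    dy = np.diff(s, axis=0, append=s[-1:, :])   # vertical differences, zero at the last row
    return int(np.count_nonzero(np.abs(dx) + np.abs(dy)))

def l0_objective(s, i, lam):
    """Data fidelity plus lam times the gradient count, for smoothed image s and input i."""
    return float(np.sum((s - i) ** 2) + lam * gradient_count(s))
```

A larger `lam` penalizes every remaining edge more heavily, so the minimizer keeps fewer, stronger edges; this is why different scale factors give different smoothing strengths.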

The neighbor embedding super-resolution method was proposed by Hong Chang et al. [6]. In this paper it is referred to as the classical neighbor embedding super-resolution method. It assumes that the manifolds formed by paired high- and low-resolution image patches are similar in local structure. Therefore, the neighbor relationship of each patch in the low-resolution data can be mapped to the corresponding patch in the high-resolution data. According to the structural features of Chinese characters, we have optimized the classical neighbor embedding super-resolution method to make it more suitable for Chinese character images.

3. Proposed Method in This Paper

The overall flow chart of proposed joint method is illustrated in Figure 1, and the specific steps are described as follows.

Step 1 (illumination layer removal from the Chinese character image with interference). For an input Chinese character image with dark background, the layer separation method [15] is used to remove the illumination layer from the input image, and the reflectance layer, which carries the essential properties of the input image, is retained. For an input Chinese character image with underlines, the same separation method is used to remove the illumination layer, with the separation parameter adjusted appropriately. This also partially removes underlines whose color is lighter than that of the characters.

Step 2 (wavelet decomposition of the reflectance layer and image smoothing of the coefficient subimages). Wavelet decomposition is applied to the reflectance layer obtained in Step 1, giving one low-frequency coefficient subimage and three detail coefficient subimages for the horizontal, vertical, and diagonal directions [16]. Since these four subimages retain different structural features of the image, image smoothing via L0 gradient minimization [17] is applied to each of them with a different scale factor. For a Chinese character image with dark background, the low-frequency coefficient subimage is smoothed strongly while the detail coefficient subimages are smoothed weakly. For a Chinese character image with underlines, the low-frequency coefficient subimage is smoothed moderately, the horizontal detail coefficient subimage is smoothed strongly, and the detail coefficient subimages of the other directions are smoothed weakly. Finally, wavelet synthesis is applied to the smoothed coefficient subimages [16].
This smoothing filters out most of the interference in the Chinese character image: most of the noise in an image with dark background, and most of the underlines in an image with underlines, provided that the underlines are lighter in color than the characters.
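The structure of Step 2 can be sketched with a one-level Haar transform, chosen here only because it is short to write out; the wavelet actually used is not stated in the text, so the Haar basis is an assumption. The `smoother` argument is a hypothetical callable standing in for the L0 smoothing routine, and the function names are likewise made up for the example.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2-D Haar decomposition; x must have even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0      # low-frequency subimage
    lh = (a + b - c - d) / 2.0      # differences between rows: horizontal structures
    hl = (a - b + c - d) / 2.0      # differences between columns: vertical structures
    hh = (a - b - c + d) / 2.0      # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (wavelet synthesis)."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def smooth_subbands(image, smoother, lam_ll, lam_lh, lam_hl, lam_hh):
    """Smooth each subimage with its own scale factor, then synthesize."""
    ll, lh, hl, hh = haar_dwt2(image)
    return haar_idwt2(smoother(ll, lam_ll), smoother(lh, lam_lh),
                      smoother(hl, lam_hl), smoother(hh, lam_hh))
```

In this sketch a horizontal line such as an underline contributes only to `ll` and `lh`, which is consistent with smoothing the horizontal detail subimage strongly when underlines are the interference.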

Step 3 (background processing and filtering of the image after wavelet synthesis). We first convert the image after wavelet synthesis to gray scale. Because the background values of the Chinese character image may have changed to different degrees during the preceding processing, a simple background processing step is performed: with an appropriately selected threshold, the pixel values of the image background are set to 1 (the maximum value after pixel value normalization). Some noise may still remain in the image, so a simple median filter is applied to remove it. These steps complete the interference suppression preprocessing of the Chinese character image before super-resolution.
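A minimal sketch of the two operations in Step 3, assuming a normalized gray-scale image with values in [0, 1]. The 3×3 window and the threshold value in the usage note are hypothetical, since the text does not specify them.

```python
import numpy as np

def whiten_background(gray, threshold):
    """Set every pixel at or above the threshold to 1.0 (pure white background)."""
    out = gray.astype(float).copy()
    out[out >= threshold] = 1.0
    return out

def median_filter3(gray):
    """3x3 median filter with edge replication."""
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return np.median(windows, axis=0)
```

For instance, `median_filter3(whiten_background(gray, 0.8))` whitens a grayish background and then removes isolated dark pixels; the threshold has to sit between the stroke values and the background values of the particular image.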

Step 4 (optimized neighbor embedding super-resolution method for Chinese character images). Building on the classical neighbor embedding super-resolution method proposed by Hong Chang et al. [6] and our previous study [18], and considering the particular characteristics of Chinese character images, we optimize the neighbor embedding super-resolution method in both the training and test phases.
In composing the key feature vector of a low-resolution image patch, we not only take full account of the structural features of Chinese characters and increase the weights of the horizontal and vertical stroke features, but also consider the other stroke features, which are difficult to standardize. Therefore, we use the gray-scale Chinese character image itself as one key feature, combined with the four image features generated by the first-order and second-order gradient operators in the horizontal and vertical directions, respectively, as shown in (4), to generate the feature vector of the low-resolution image patch [1, 9]:

$$f_1 = [-1, 0, 1], \quad f_2 = f_1^T, \quad f_3 = [1, 0, -2, 0, 1], \quad f_4 = f_3^T, \tag{4}$$

where $f_1^T$, $f_3^T$ are the transpositions of $f_1$ and $f_3$, respectively.
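The composition of the feature vector can be sketched as follows, assuming the first- and second-order gradient filters of the sparse representation method [1, 9], namely [-1, 0, 1] and [1, 0, -2, 0, 1], applied along rows and along columns. The `gradient_weight` parameter is a hypothetical way of increasing the weight of the gradient features; the weighting actually used is not given in the text.

```python
import numpy as np

F_FIRST = np.array([-1.0, 0.0, 1.0])             # assumed first-order gradient filter
F_SECOND = np.array([1.0, 0.0, -2.0, 0.0, 1.0])  # assumed second-order gradient filter

def filter_rows(img, kernel):
    """Correlate every row with a 1-D kernel, edge-replicated, same output size."""
    r = len(kernel) // 2
    padded = np.pad(img, ((0, 0), (r, r)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for k, coeff in enumerate(kernel):
        out += coeff * padded[:, k:k + img.shape[1]]
    return out

def feature_maps(gray, gradient_weight=1.0):
    """Stack the gray image with its four (optionally re-weighted) gradient maps."""
    gx1 = filter_rows(gray, F_FIRST)         # first-order, along rows
    gy1 = filter_rows(gray.T, F_FIRST).T     # first-order, along columns
    gx2 = filter_rows(gray, F_SECOND)        # second-order, along rows
    gy2 = filter_rows(gray.T, F_SECOND).T    # second-order, along columns
    grads = gradient_weight * np.stack([gx1, gy1, gx2, gy2])
    return np.concatenate([gray[None].astype(float), grads])

def patch_feature(maps, y, x, size):
    """Feature vector of the size-by-size patch whose top-left corner is (y, x)."""
    return maps[:, y:y + size, x:x + size].ravel()
```

With 3×3 patches and five maps, each patch is described by a 45-dimensional vector; nearest-neighbor search and weight computation then operate on these vectors rather than on raw pixels alone.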
Following our previous study [18], low-rank matrix recovery and a grouping strategy for the training and test image patches are also adopted in the super-resolution process to further improve its effect. Low-rank matrix recovery, proposed by Emmanuel J. Candès et al., decomposes the input matrix $D$ into a low-rank matrix $A$ and a sparse matrix $E$; i.e., $D = A + E$ [19]. In this way the sparse matrix, which contains part of the noise and illumination interference, can be removed to a certain extent.
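One common way to compute such a decomposition is the inexact augmented Lagrange multiplier scheme, sketched below. The text does not say which solver is used, so the algorithm, the default weight `lam`, and the step parameters are assumptions made for the example.

```python
import numpy as np

def rpca_ialm(d, lam=None, tol=1e-7, max_iter=500):
    """Split a non-zero matrix d into low-rank a and sparse e with a + e close to d."""
    m, n = d.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))           # commonly used default weight
    norm_two = np.linalg.norm(d, 2)
    y = d / max(norm_two, np.abs(d).max() / lam)  # initial Lagrange multiplier
    mu, rho = 1.25 / norm_two, 1.5
    a = np.zeros_like(d, dtype=float)
    e = np.zeros_like(d, dtype=float)
    d_norm = np.linalg.norm(d, "fro")
    for _ in range(max_iter):
        # Sparse part: entrywise soft thresholding.
        t = d - a + y / mu
        e = np.sign(t) * np.maximum(np.abs(t) - lam / mu, 0.0)
        # Low-rank part: singular value thresholding.
        u, s, vt = np.linalg.svd(d - e + y / mu, full_matrices=False)
        a = (u * np.maximum(s - 1.0 / mu, 0.0)) @ vt
        residual = d - a - e
        y = y + mu * residual
        mu = mu * rho
        if np.linalg.norm(residual, "fro") / d_norm < tol:
            break
    return a, e
```

In the setting of the text, the columns of `d` would be vectorized image patches from one group; discarding `e` and keeping `a` removes the part of the data that does not fit the shared low-rank structure.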
In the implementation, some other optimizations are adopted in the training and test phases of our neighbor embedding super-resolution method. For example, following the sparse representation super-resolution method [1, 9], we randomly sample a predetermined number of patches from the training images to speed up the training process. In addition, image patches with small feature variance are discarded during training. Owing to the length limit, the other details of the optimized code are not shown here.
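The two training optimizations mentioned above, random patch sampling and discarding patches with small variance, can be sketched as follows. The function names and the variance threshold are hypothetical, and the variance is computed on raw patch values here for brevity, whereas the text refers to feature variance.

```python
import numpy as np

def sample_patches(image, patch_size, num_patches, rng):
    """Draw num_patches random patch_size-by-patch_size patches, one per row."""
    h, w = image.shape
    ys = rng.integers(0, h - patch_size + 1, size=num_patches)
    xs = rng.integers(0, w - patch_size + 1, size=num_patches)
    return np.stack([image[y:y + patch_size, x:x + patch_size].ravel()
                     for y, x in zip(ys, xs)])

def drop_flat_patches(patches, min_variance):
    """Keep only the rows whose variance exceeds min_variance."""
    keep = patches.var(axis=1) > min_variance
    return patches[keep], keep
```

For images with black characters on a white background, most randomly drawn patches are pure background; removing them shrinks the training set without losing stroke information.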

4. Experimental Results

In this section, we verify the effectiveness of the proposed interference suppression method for Chinese character images with interference, and we compare our super-resolution method for Chinese character images with the super-resolution methods of classical neighbor embedding [6], sparse representation [1, 9], and transformed self-exemplars [14]. The section is divided into two parts. The first part is the preprocessing experiment on Chinese character images with dark background or underlines. The second part is the super-resolution experiment on Chinese character images after interference suppression.

The experiments are performed in Matlab R2014b on a computer with an Intel Core i7-4790 processor. The Chinese character images with dark background or underlines, as well as the training set of high-resolution Chinese character images, are generated by printing the electronic version on paper and then either photographing it in a low-light environment or scanning it with a scanner.

For Chinese character images with dark background, the main parameters of the preprocessing experiment are as follows: the parameter of illumination layer separation is 80, the smoothing scale factors for the three detail coefficient subimages are all 1e-3, and the smoothing scale factor for the low-frequency coefficient subimage is 1e-1. For Chinese character images with underlines, the main parameters are as follows: the parameter of illumination layer separation is 5, the smoothing scale factors for the detail coefficient subimages of the horizontal, vertical, and diagonal directions are 1e-1, 1e-3, and 1e-3, respectively, and the smoothing scale factor for the low-frequency coefficient subimage is 5e-2.

As the preprocessing results for Chinese character images with dark background in Figures 2 and 3 show, our preprocessing method effectively improves the illumination level of the image. For Chinese character images with underlines, the preprocessing removes the underlines as far as possible, as illustrated in Figures 4 and 5. The interference suppression preprocessing therefore provides better input for the subsequent super-resolution.

The main parameters of our super-resolution method are as follows: the image resolution is magnified by a factor of 3, and the low-resolution patch size is 3×3 with a 1-pixel overlap between patches. The number of nearest neighbors searched for each test image patch is 5. Meanwhile, following the sparse representation super-resolution method [1, 9] and our previous study [18], the original low-resolution image can be upsampled by a factor of 2 and divided into 6×6 patches with a 2-pixel overlap to participate in the computation. The main selectable parameters of the super-resolution methods of classical neighbor embedding [6], sparse representation [1, 9], and transformed self-exemplars [14] are basically the same as those of our method. For example, to improve the performance of the sparse representation method, its dictionary size is set to 1024 and the number of sampled patches to 1000000. The training set shared by the classical neighbor embedding method, the sparse representation method, and our method in the super-resolution experiment consists of a single large training image.
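The patch layout implied by these parameters can be sketched as follows: a 3×3 patch with a 1-pixel overlap corresponds to a step of 2 pixels between patch origins. Adding an extra origin so that the last patch touches the image border, and averaging the overlapping regions when patches are put back together, are assumptions made for the example, as are the function names.

```python
import numpy as np

def patch_origins(length, patch, overlap):
    """Top-left coordinates along one axis so that patches cover the whole axis."""
    step = patch - overlap
    starts = list(range(0, length - patch + 1, step))
    if starts[-1] != length - patch:
        starts.append(length - patch)    # extra patch so the border is covered
    return starts

def extract_patches(img, patch, overlap):
    """Return patch coordinates and the corresponding overlapping patches."""
    ys = patch_origins(img.shape[0], patch, overlap)
    xs = patch_origins(img.shape[1], patch, overlap)
    coords = [(y, x) for y in ys for x in xs]
    return coords, [img[y:y + patch, x:x + patch] for y, x in coords]

def blend_patches(coords, patches, shape):
    """Reassemble an image, averaging wherever patches overlap."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for (y, x), p in zip(coords, patches):
        acc[y:y + p.shape[0], x:x + p.shape[1]] += p
        cnt[y:y + p.shape[0], x:x + p.shape[1]] += 1
    return acc / cnt
```

At a magnification factor of 3, each low-resolution origin `(y, x)` maps to `(3 * y, 3 * x)` and each 3×3 patch to a 9×9 high-resolution patch, so the same `blend_patches` routine can assemble the output image from the reconstructed high-resolution patches.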

In the implementation, we modify the original codes of classical neighbor embedding, sparse representation, and transformed self-exemplars so that they directly magnify the input image after interference suppression, instead of first downsampling the input image and then magnifying it as the original codes do. Moreover, some of these codes accept only color images, whereas the image after interference suppression is a gray-scale image, so we slightly modify them to process gray-scale images.

To compare the super-resolution results of the different methods quickly and clearly, we select only one Chinese character from each preprocessed image as the test material. The super-resolution results for Chinese character images with dark background after preprocessing are illustrated in Figures 6 and 7, and those for Chinese character images with underlines after preprocessing in Figures 8 and 9. For the test images, our super-resolution method is superior to bicubic interpolation and to the super-resolution methods of classical neighbor embedding [6], sparse representation [1, 9], and transformed self-exemplars [14], especially along the edge contours of the characters. Note that, because of the document layout, all images in the experimental section have been reduced by a certain percentage for display.

5. Conclusion

In this paper we propose a joint method of interference suppression and super-resolution for Chinese character images with interference. The interference suppression method works well on Chinese character images with dark background or typesetting underlines. First, the reflectance layer is extracted from the input image and decomposed by the wavelet transform. Then, according to the interference type and the coefficient subimage, the coefficient subimages are smoothed via L0 gradient minimization with different scale factors. Finally, simple background processing and filtering complete the interference suppression preprocessing. In the super-resolution phase, considering the structural features of Chinese characters, the weights of the horizontal and vertical strokes are increased, and other strokes, e.g., left-falling and right-falling strokes, are also considered in the grouping and matching of image patches. Together with some other optimizations, this yields a super-resolution method better suited to Chinese character images.

The proposed joint method effectively suppresses the interference of dark background or underlines in Chinese character images and also achieves a better super-resolution effect on them. However, it still has limitations when a Chinese character image has both dark background and underlines. In addition, the proposed interference suppression method mainly targets underlines whose color is lighter than that of the characters; underlines whose color is close to that of the characters cannot be removed effectively.

In addition, many other super-resolution studies carry out verification experiments with a known original high-resolution image, whereas in practice the original high-resolution image is mostly unknown. We therefore first magnify the image after interference suppression by interpolation and then use the optimized neighbor embedding super-resolution method to improve the visual clarity of the interpolated image. Because no original high-resolution image is available as a benchmark, objective image quality metrics cannot be used, and in this paper we evaluate the super-resolution results of the different methods only subjectively.

Moreover, some parameters in our preprocessing need to be adjusted manually. The imaging illumination condition generally has a great influence on subsequent image processing, and different degrees of weak illumination give rise to different interference in Chinese character images. Likewise, Chinese character images with different strokes, sizes, and colors affect underline removal differently. It is difficult to process Chinese character images with various degrees of dark background or underlines using a fixed set of parameters, and our method does not yet adjust these parameters adaptively. We leave these limitations for future research.

Data Availability

The experimental image data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by National Natural Science Foundation of China (Nos. 41601353, 61501372, and 61801384), Foundation of State Key Laboratory of Transient Optics and Photonics, Chinese Academy of Sciences (No. SKLST201614), Young Elite Scientists Sponsorship Program by CAST (No. 2017QNRC001), Natural Science Basic Research Plan in Shaanxi Province of China (No. 2017JQ4003), Scientific Research Project of Shaanxi Provincial Education Department (No. 17JK0766), and Science Research Fund of Northwest University (No. 15NW30).