Abstract
3D reconstruction using straight-line segments as features offers high precision at low computational cost and is especially suitable for large-scale urban datasets. However, the line matching step in the existing method suffers from mismatches, for two main reasons: the line detection results are not located at the true edges of the image, and there is no consistency check of the matching pairs. To solve this problem, a line correction and matching method for 3D reconstruction of target line structures is proposed in this paper. Firstly, the edge features of the image are extracted to obtain a binarized edge map, and the extended gradient map is computed from the edge map and the gradient in order to establish the gradient gravitational map. Secondly, a straight-line detection method extracts all the linear features used for the 3D reconstruction, and the line positions are corrected by the gradient gravitational map. Finally, the point feature matching results are used to compute epipolar lines, the line matching results of three adjacent images determine the final local feature areas to be checked, and random sampling is used to verify the feature similarity of the line matches within these small neighborhoods. These steps eliminate the mismatched lines. The experimental results demonstrate that the 3D model obtained using the proposed method has higher integrity and accuracy than those of existing methods.
1. Introduction
Using the camera imaging model to recover the 3D structure of an object from an acquired 2D image sequence is one of the classic problems in the field of computer vision. 3D reconstruction refers to the establishment of mathematical models of 3D objects suitable for computer representation and processing. It is the basis for processing, manipulating, and analyzing the properties of 3D objects in a computer environment, and it is also a key technology for establishing virtual reality that expresses the objective world in a computer. Making computers perceive 3D environmental information has always been one of the goals of computer vision. The development of computer vision and deep learning has brought significant advances in fields such as autonomous driving, biometrics, video recognition, and drones. For these areas to improve further, however, 3D reconstruction may offer a promising breakthrough.
Existing 3D reconstruction technology generally uses structure-from-motion (SfM) [1] approaches and multiview stereo (MVS) [2] pipelines (e.g., PMVS [3] or SURE [4]). The former obtains a sparse point cloud model of the scene together with the camera pose information, which MVS then uses to develop a dense 3D point cloud model. However, because the feature point set is very large, the MVS algorithm is slow, often requiring a large amount of time and computing memory, and the resulting point clouds are difficult to inspect in a viewer. Moreover, image-based 3D reconstruction is affected by factors such as lighting and occlusion when extracting feature points, and it is sensitive to the accuracy of feature point matching and of camera calibration when computing the camera projection matrix and solving for space points. Correctly extracting and matching feature points and accurately solving the 3D geometric relationships have always been difficult problems in the field of computer vision. Therefore, more complex geometric primitives can be selected as the data representation, such as planes (e.g., [5–7]) or lines (e.g., [8–10]). Analysis of the pinhole camera model, epipolar geometry, and various line segment detection algorithms shows that 3D reconstruction based on line matching is feasible. In addition, man-made buildings have prominent line segment features; if the relevant 3D information is extracted and matched, the efficiency of 3D reconstruction can be enhanced.
The unique geometric features of line segments can thus be used to extract and match the related 3D information. These line segments can be obtained by any line segment detector. The two most commonly used line detection algorithms are LSD [11] and EDL [12]; both provide accurate detection with very few false positives in a highly efficient way. Figure 1 shows example images with line segments obtained using the LSD algorithm.

Figure 1: (a, b) example images with line segments detected by the LSD algorithm.
The literature shows that 3D reconstruction technology has evolved from point-based structure-from-motion algorithms to line-based multiview stereo algorithms, but each algorithm has its own advantages and disadvantages. How to obtain a high-precision 3D scene model is currently a focus of 3D reconstruction research. Since the structure of a scene is usually complicated and of large scale, obtaining a high-precision 3D model remains a problem that deserves attention and requires a significant investment of resources, energy, and technical research.
2. Related Work
The point-based structure-from-motion algorithm relies heavily on distinctive textures in the scene and is weak when facing monotonous environments. Although the SfM algorithm strives to create sufficient feature matches in order to successfully compute the correct camera poses, the 3D models it generates are usually very sparse. Since the linear characteristics of man-made building environments are very obvious and the line segment is the most common geometric feature in such environments, completing feature extraction and matching with straight-line segment features is a good choice.
Bay et al. [13] used line segments from two uncalibrated images to determine the relative camera poses and to compute a piecewise planar 3D model. However, this method is not suitable for processing more than two images, is not robust when dealing with unstable lighting conditions, and is unable to handle some outdoor scenes.
Further, Schindler et al. [14] incorporated the Manhattan-world assumption into the reconstruction procedure to decrease the computational complexity and to reconstruct buildings from two views.
In 2010, Jain et al. [15] proposed a method of reconstructing lines from multiple stereo images. The method does not require correspondences of line segments between the images; instead, it reconstructs the line segments independently using connectivity constraints and then obtains the final 3D model by merging. Although this method achieves good visual results, it is not suitable for large-scale datasets.
In 2014, Micusik and Wildenauer [16] proposed a SLAM-like system that matches lines across narrow baselines and showed impressive results, especially for indoor scenes. However, this method only attempts to estimate the camera pose; 3D reconstruction from the line segments themselves remains extremely difficult.
Hofer et al. [17] proposed a public line-based 3D reconstruction tool called Line3D++. The method first establishes a large set of potential line correspondences between images through weak epipolar constraints and uses a scoring formulation based on mutual support to separate correct matches from incorrect ones for each segment. The final line-based 3D model is obtained by clustering the 2D segments from different views using an efficient graph-clustering formulation. However, the 3D reconstruction results of the Line3D++ method lose part of the line structure, mainly because the line detection results used in the line matching step are not located at the true edges of the image and there is no consistency check of the matching line pairs.
In order to solve this problem, this paper first corrects the LSD line detection results produced by Line3D++ and then uses the epipolar constraint principle to eliminate the mismatched lines. Finally, an accurate and complete 3D reconstruction result is obtained.
3. Line Position Correction
Let the image be $I$. Use the Canny operator [18] to perform edge detection on $I$ and obtain the edge map $E$. Solve the gradient map of $E$ using the form of first-order differences:

$$G_x(x, y) = E(x + 1, y) - E(x, y), \quad G_y(x, y) = E(x, y + 1) - E(x, y) \qquad (1)$$

where $(x, y)$ are the coordinates of the pixel, while $G_x$ and $G_y$ are the first-order partial derivatives in the $x$ and $y$ directions, respectively.
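As a minimal illustration of equation (1), the following C++ sketch computes the first-order difference gradient of a binarized edge map. The row-major float layout and the zero-gradient boundary handling are illustrative assumptions, not the implementation used in the paper.

```cpp
#include <vector>

// Gradient of the edge map E via forward differences (equation (1)).
struct GradientMap {
    std::vector<float> gx, gy;  // partial derivatives in x and y
    int width = 0, height = 0;
};

GradientMap computeGradient(const std::vector<float>& E, int width, int height) {
    GradientMap G;
    G.width = width; G.height = height;
    G.gx.assign(width * height, 0.0f);
    G.gy.assign(width * height, 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int i = y * width + x;
            // Forward differences; the last row/column keeps a zero gradient.
            if (x + 1 < width)  G.gx[i] = E[i + 1] - E[i];
            if (y + 1 < height) G.gy[i] = E[i + width] - E[i];
        }
    }
    return G;
}
```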
Figure 2 shows the construction process of the extended gradient map, where Figure 2(a) shows the edge map and Figure 2(b) shows the result of applying the gradient formula (1) to Figure 2(a). The pixels surrounding the edge pixels are determined from the gradient direction and are then themselves regarded as edge pixels; at this point the gradient is computed again, as shown in Figure 2(c). This process is repeated until all pixels have been traversed, and the final result is obtained as shown in Figure 2(d).

Figure 2: construction of the extended gradient map: (a) edge map; (b) gradient of (a) computed by formula (1); (c) gradient after the surrounding pixels are treated as edge pixels; (d) final extended gradient map.
The gradient gravitational map is created from the extended gradient map in order to correct the line detection results. Figure 3 shows the process of constructing the gradient gravitational map. Let matrix $A$ be an extended gradient map generated from the edge map; Figure 3(a) shows its representation.

Figure 3: construction of the gradient gravitational map: (a) representation of the extended gradient map $A$; (b) matrix $B$ being filled; (c) representation of the completed matrix $B$.
First, create a matrix $B$ based on the red region in matrix $A$, assigning the gradient values of the edge pixels to the elements at the corresponding positions. Then, traverse the green area in matrix $A$, fill the corresponding positions in $B$ with the same element values as the edge positions they point to, and process the purple area in a similar manner. Note that if a pixel $p$ has edge pixels around it but points to other pixels, a circle of increasing radius is drawn around $p$, and the edge position at which it first intersects, $q$, is found; at this time, $p$ is assigned the same element value as $q$. The process is repeated until all the elements in $A$ are traversed and matrix $B$ is filled, as shown in Figure 3(b). At this point, every element in matrix $B$ has been assigned, thereby yielding the gradient gravitational map. Figure 3(c) shows the representation of matrix $B$.
The line detected by a line detection algorithm usually deviates somewhat from the actual edge. Figure 4 shows an example of the line correction problem. Figure 4(a) shows the line detection result, where the yellow dotted lines and the red solid lines represent the actual edges and the detected lines, respectively. Figure 4(b) shows the corresponding gradient gravitational map of Figure 4(a). Let $p$ be a point on the detected line and $d$ its distance to the actual edge of that line. Correcting the line position by the gradient gravitational map alone can fail in the following two cases: (1) the first problem occurs in the local area of a corner: when $p$ is closer to the perpendicular edge than to its own edge, the gravitational map attracts $p$ to the perpendicular edge and yields an incorrect correction, whereas $p$ should in fact be corrected onto its own edge; (2) the second problem occurs at abutting edges: when $p$ is closer to an adjacent edge than to its own edge, the gravitational map attracts $p$ to the adjacent edge, again yielding an incorrect correction.

Figure 4: example of the line correction problem: (a) line detection result; (b) corresponding gradient gravitational map.
In order to address the aforementioned problems, this paper proposes the following straight-line correction method.
In Figure 5, the blue line is the correct edge position. Take the yellow straight-line segment as an example. The two endpoints of the yellow line are $p$ and $q$. The yellow line is divided into $n$ equal parts, and the division points are recorded as $m_1, m_2, \ldots, m_{n-1}$. The corrected positions $p'$ and $q'$ of the endpoints $p$ and $q$, respectively, and the corrected positions $m'_i$ of the division points are calculated by the gradient gravitational map.
Figure 5: correction of a detected line segment toward the true edge (blue).
Let $k$ be the slope of the original line segment. Calculate the slope $k_i$, $i = 1, \ldots, n$, of each small line segment after correction. Then, sort the sequence in ascending order to obtain the new sequence $k'_1 \le k'_2 \le \cdots \le k'_n$. Take out the middle $m$ consecutive slopes $k'_j, k'_{j+1}, \ldots, k'_{j+m-1}$ and calculate the standard deviation as

$$\sigma = \sqrt{\frac{1}{m} \sum_{i=j}^{j+m-1} \left(k'_i - \bar{k}\right)^2} \qquad (2)$$

where $\bar{k} = \frac{1}{m} \sum_{i=j}^{j+m-1} k'_i$.
Set a threshold $\delta$; if $\sigma < \delta$, then add the slopes to the left of this block, $k'_{j-1}, k'_{j-2}, \ldots$, one at a time, recomputing the standard deviation after each addition and checking the condition $\sigma < \delta$. If the condition holds, continue with the next slope and repeat the above operation; otherwise, stop and remove the slope just added. Perform the same operations on the slopes to the right, $k'_{j+m}, k'_{j+m+1}, \ldots$.
At this point, the small line segments whose slopes passed the check are saved; let their number be $t$, so that the total number of endpoints is $2t$. Connectivity is determined by the number of occurrences of each endpoint. Let $c(e)$ denote the number of occurrences of endpoint $e$. If there are exactly two endpoints satisfying $c(e) = 1$ and $t - 1$ endpoints satisfying $c(e) = 2$, then all the saved small line segments are connected, and the two points satisfying $c(e) = 1$ are taken as the new endpoint pair. This pair is then extended so that the segment regains the same length as the original straight-line segment (a code sketch of the slope check is given below).
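The following C++ sketch illustrates the slope-consistency check described above under stated assumptions: the window of the middle $m$ sorted slopes is grown to the left and to the right while the standard deviation of equation (2) stays below the threshold $\delta$. The names and parameters (`m`, `delta`) are illustrative, not the authors' interface.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Standard deviation of v over the half-open index range [lo, hi).
static double stddev(const std::vector<double>& v, int lo, int hi) {
    double mean = 0.0;
    for (int i = lo; i < hi; ++i) mean += v[i];
    mean /= (hi - lo);
    double s = 0.0;
    for (int i = lo; i < hi; ++i) s += (v[i] - mean) * (v[i] - mean);
    return std::sqrt(s / (hi - lo));
}

// Keep the largest window of sorted slopes, grown from the middle m values,
// whose standard deviation stays below delta (assumes m <= k.size()).
std::vector<double> consistentSlopes(std::vector<double> k, int m, double delta) {
    if (static_cast<int>(k.size()) < m) return k;
    std::sort(k.begin(), k.end());
    int lo = (static_cast<int>(k.size()) - m) / 2;  // middle m slopes
    int hi = lo + m;
    // Grow to the left, then to the right, one slope at a time; a failing
    // addition is simply not committed, matching "remove the added slope".
    while (lo > 0 && stddev(k, lo - 1, hi) < delta) --lo;
    while (hi < static_cast<int>(k.size()) && stddev(k, lo, hi + 1) < delta) ++hi;
    return std::vector<double>(k.begin() + lo, k.begin() + hi);
}
```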
The extension method is as follows.
Assume that the distances by which the two ends $p$ and $q$ of the line segment in Figure 6 were shortened are $d_1$ and $d_2$, respectively, and that the corrected endpoints $p'$ and $q'$ are used as the starting nodes. Then, with $\hat{u}$ the unit direction vector from $p'$ to $q'$, the extended endpoints are calculated as

$$p'' = p' - d_1 \hat{u}, \quad q'' = q' + d_2 \hat{u} \qquad (3)$$

and $p''$ and $q''$ are taken as the new pair of endpoints of the corrected straight-line segment.
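A minimal sketch of the extension in equation (3), assuming each corrected endpoint is simply pushed outward along the segment direction by the amount it was shortened; the point type and in-place update are illustrative choices.

```cpp
#include <cmath>

struct Pt { double x, y; };

// Stretch the corrected segment p'q' back to the original length by moving
// p' outward by d1 and q' outward by d2 along the segment direction.
void extendSegment(Pt& p, Pt& q, double d1, double d2) {
    double ux = q.x - p.x, uy = q.y - p.y;
    double len = std::sqrt(ux * ux + uy * uy);
    if (len == 0.0) return;            // degenerate segment, nothing to do
    ux /= len; uy /= len;              // unit direction from p toward q
    p.x -= d1 * ux; p.y -= d1 * uy;    // push p outward by d1
    q.x += d2 * ux; q.y += d2 * uy;    // push q outward by d2
}
```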
Figure 6: extension of the corrected line segment to the original length.
The line matching results are then refined. The epipolar lines in three adjacent images are calculated using the point feature matching results. The matching results are combined to determine the final local feature areas to be verified, and random sampling is used to verify the feature similarity in these small neighborhoods. Thus, incorrectly matched line features are eliminated. If a matching line exists in only two adjacent images, the above method is performed on those two images alone. Figure 7 shows the process of determining small neighborhoods by combining epipolar lines.

Figure 7: determining small neighborhoods by combining epipolar lines: (a)–(c) three adjacent images.
In Figure 7, take three adjacent images $I_1$, $I_2$, and $I_3$ as examples, with matching lines $l_1$, $l_2$, and $l_3$, respectively. The blue lines are the epipolar lines, while the yellow and black lines represent one of the epipolar lines and the corresponding matching line, respectively. In this paper, the epipolar line and the corresponding matching line are used as references to obtain the small neighborhoods around each line, that is, the areas surrounded by the green lines. The specific solution method is as follows.
Firstly, the bisection points of the line segment $l_2$ are obtained. Then, according to the epipolar constraint, the corresponding epipolar lines of these points in Figures 7(a) and 7(c) are obtained, and the corresponding points on $l_1$ and $l_3$ are obtained as the intersections of these epipolar lines with $l_1$ and $l_3$. This process determines the corresponding points on the matching lines $l_1$, $l_2$, and $l_3$ and allows the similarity of the local regions of the lines to be calculated from the small neighborhood features of $l_1$, $l_2$, and $l_3$, respectively. In order to speed up the calculation, the regional similarity can be computed by random sampling.
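The epipolar transfer used here can be illustrated with homogeneous coordinates: the epipolar line of a point $x$ in a neighboring view is $l' = Fx$, and the corresponding point is the intersection of $l'$ with the line through the matching segment. The following sketch assumes a known 3x3 fundamental matrix $F$; it illustrates the standard construction, not the paper's code.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;  // homogeneous point or line
using Mat3 = std::array<Vec3, 3>;    // fundamental matrix F

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// Epipolar line of point x in the neighboring view: l' = F x.
Vec3 epipolarLine(const Mat3& F, const Vec3& x) {
    Vec3 l{0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) l[i] += F[i][j] * x[j];
    return l;
}

// Intersection of two homogeneous lines (e.g., the epipolar line and the
// line through the matching segment) is their cross product.
Vec3 intersect(const Vec3& l1, const Vec3& l2) { return cross(l1, l2); }
```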
Secondly, the size of each small neighborhood is determined. In the experiment, a radius threshold $r$ is set, and the pixels closer to the line than this radius are saved in a new image. Finally, the gradient direction determines whether a pixel belongs to the left or the right neighborhood. To keep the directions consistent, the area toward which the line's gradient points is taken as the right neighborhood, and the other side as the left neighborhood.
The similarity of the line neighborhoods is determined by calculating the similarity of the pixel colors within the regions. Let the corresponding matching areas of $l_1$, $l_2$, and $l_3$ be $S_1$, $S_2$, and $S_3$, containing $n_1$, $n_2$, and $n_3$ pixels, respectively. The neighborhood similarities of $S_1$-$S_2$, $S_1$-$S_3$, and $S_2$-$S_3$ are calculated. With $n$ randomly sampled corresponding pixel pairs, whose colors are $c_1(i)$ and $c_2(i)$, the neighborhood similarity between $S_1$ and $S_2$ is

$$\operatorname{sim}(S_1, S_2) = \frac{1}{n} \sum_{i=1}^{n} \left| c_1(i) - c_2(i) \right| \qquad (4)$$

where smaller values indicate more similar neighborhoods.
The neighborhood similarities between $S_1$-$S_3$ and $S_2$-$S_3$ are calculated in a similar manner to equation (4).
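A hedged C++ sketch of the sampled color comparison, assuming the similarity of equation (4) is an average per-pixel color difference over randomly drawn corresponding pixels (lower values mean more similar regions); the pixel layout and sampling scheme are reconstructions, not the paper's code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <vector>

struct RGB { float r, g, b; };

// Average color difference over n randomly sampled pixel pairs; S1 and S2
// are assumed to hold corresponding pixels at the same indices.
double neighborhoodSimilarity(const std::vector<RGB>& S1,
                              const std::vector<RGB>& S2, int n) {
    int limit = static_cast<int>(std::min(S1.size(), S2.size()));
    if (limit == 0 || n <= 0) return 0.0;
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        int j = std::rand() % limit;  // random corresponding pixel pair
        sum += std::fabs(S1[j].r - S2[j].r) + std::fabs(S1[j].g - S2[j].g) +
               std::fabs(S1[j].b - S2[j].b);
    }
    return sum / n;
}
```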
The calculation is performed using the above method. If the randomly sampled corresponding regions in the three images show high similarity, the matching straight line is judged to be correct. In this way, the line matching is refined, and all triplets of adjacent images in the dataset are processed in order to obtain the final line matching results.
The construction of the gradient gravitational map is summarized in Algorithm 1.
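Since the algorithm box itself is not reproduced here, the following C++ sketch shows one possible realization of the idea under the reading above: every pixel of $B$ ends up holding the position of the edge pixel that attracts it, with a breadth-first expansion approximating the layer-by-layer (red, green, purple) traversal. This is an assumption-laden sketch, not the authors' code.

```cpp
#include <queue>
#include <vector>

// Gravitational map B: each pixel stores the linear index of the edge pixel
// that attracts it; edge pixels attract themselves.
std::vector<int> buildGravitationalMap(const std::vector<unsigned char>& edge,
                                       int width, int height) {
    std::vector<int> B(width * height, -1);  // -1: not yet assigned
    std::queue<int> frontier;
    for (int i = 0; i < width * height; ++i)
        if (edge[i]) { B[i] = i; frontier.push(i); }
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        int i = frontier.front(); frontier.pop();
        int x = i % width, y = i / width;
        for (int k = 0; k < 4; ++k) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
            int j = ny * width + nx;
            if (B[j] == -1) { B[j] = B[i]; frontier.push(j); }  // inherit attractor
        }
    }
    return B;
}
```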
The method of refining the line matching result is shown in Algorithm 2.
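As the algorithm box is likewise not reproduced here, the following comment-driven C++ outline sketches how the refinement could be organized, reusing the hypothetical helpers sketched above; it is a reading of the text, not the authors' interface.

```cpp
// Outline of the match verification for lines l1, l2, l3 in three views.
bool verifyLineMatch(/* l1, l2, l3, fundamental matrices, images */) {
    // 1. Bisect l2 and transfer each bisection point to the neighboring
    //    views via l' = F x (see epipolarLine above); intersect with l1, l3.
    // 2. Around each corresponding point, collect the left/right
    //    neighborhoods of radius r, split by the line's gradient direction.
    // 3. Randomly sample pixel pairs and evaluate equation (4) for
    //    S1-S2, S1-S3, and S2-S3 (see neighborhoodSimilarity above).
    // 4. Accept the match only if all three similarities pass the
    //    threshold; otherwise discard it as a mismatch.
    return true;  // placeholder decision
}
```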
4. Experiments
For the experiments, a 3.2 GHz CPU, 16 GB RAM, and an Nvidia GeForce GTX 1060 6 GB GPU were used. The proposed algorithm was implemented in C++ and (optionally) also in CUDA. The Herz-Jesu-P25 and Castle-P10 sequences from [19], together with Brickstone, were used as ground-truth test sequences.
In the experiment, the local neighborhood radius was set to 5 pixels, the number of samples was taken as a fixed proportion of all small-neighborhood pixels, the similarity threshold was set to 10, and a proportional threshold was applied in the consistency check.
In this paper, the effectiveness of the proposed method is illustrated by three sets of comparative experiments. The first shows the results of line matching and purification. The second compares the sparse 3D point cloud model with the results of the proposed method. The third compares the results of Line3D++ with those of the proposed method. Figures 8–13 show the experimental results.

Figure 8: line matching and purification: (a) direct matching result with LSD; (b) result after straight-line correction; (c) result after purification matching.
Figure 9: line matching and purification (second example): (a) direct matching result with LSD; (b) result after straight-line correction; (c) result after purification matching.
Figure 10: sparse 3D point cloud models obtained with the SfM algorithm: (a, b) two views.
Figure 11: line-based 3D reconstruction results: (a, b) two views.
Figure 12: comparison of the Line3D++ result and the proposed method, with enlarged local areas.
Figure 13: comparison of the Line3D++ result and the proposed method, with enlarged local areas.
Figures 8 and 9 show results of line matching and purification. Figures 8(a) and 9(a) show the direct match results with LSD. Figures 8(b) and 9(b) show the results of the straight line correction. Figures 8(c) and 9(c) show the results of the purification matching.
In Figure 8, compared with Figure 8(a), Figure 8(b) eliminates lines 17, 20, 18, 13, 15, 27, 9, and 21 from the matching result, while compared with Figure 8(b), Figure 8(c) further eliminates lines 20, 1, and 13. The matching errors in Figure 8(a) are not obvious in this set of experiments. For example, although lines 20 and 18 lie on the same side plane, they are located in the middle of the side plane in row 1 but at the edge of the side plane in row 2 of Figure 8(a). This situation can easily lead to incorrect matching results, mainly because the edge line detected by LSD is offset toward the middle of the side plane. The proposed method resolves this problem.
In Figure 9, compared with Figure 9(a), Figure 9(b) eliminates lines 23, 5, 4, 3, 1, 8, and 26 from the matching result, and compared with Figure 9(b), Figure 9(c) further removes line 1. Among all the removed matched line pairs, those with obvious errors are lines 2 and 26: since they are not located at the true edges of the image, they cause matching errors. When the corrected result is used as the input for matching, a matching result with higher accuracy is obtained. In addition, lines 1 and 5 in Figure 9(a) are clearly located at different edges; after processing with the method proposed in this paper, this kind of error is effectively reduced.
Figures 10 and 11 compare the sparse and the line-based 3D models: Figure 10 shows the sparse 3D point cloud models of the scene obtained by processing the image set with the SfM algorithm, and Figure 11 shows the line-based 3D reconstruction results.
It can be seen from the comparison that the sparse 3D point cloud model is able to represent the characteristics of the building, but the overall structure is not very distinct. On the other hand, although the 3D line segment model is vague at some curved features, the lines of the building are very clear. The 3D line segment model represents the geometric structure of the building more prominently, especially for buildings with many straight lines and few curves. In addition, the 3D point cloud model has poor reconstruction accuracy in textureless areas, where the obtained models often appear hollow. Compared with the 3D point cloud model, the 3D line segment model provides more complete structural information and better reflects the geometric topology of the scene. Thus, the 3D line segment model provides highly meaningful semantic 3D information for reconstruction.
Figures 12 and 13 compare the Line3D++ results with the experimental results of the proposed method. Figures 12(a), 12(b), 13(a), and 13(b) show the results of Line3D++ and the results of the proposed method, respectively. Figures 12(c), 12(e), 13(c), and 13(e) show two enlarged local areas in Figures 12(a) and 13(a). Figures 12(d), 12(f), 13(d), and 13(f) show the corresponding enlarged local areas in Figures 12(b) and 13(b).
It can be seen from the figures that the 3D reconstruction results of Line3D++ lack many lines. Compared with Line3D++, the proposed method recovers more lines and more local structures, which significantly improves the integrity of the object. The experimental results can be examined from different viewing angles and for different structures. For example, Figure 12(c) shows the position of the door, where the reconstructed line segments are very sparse. The line segments reconstructed in Figure 12(d) are relatively dense, and many important lines are restored, making the structure of the door clearly visible.
Figure 12(e) shows the position of the brick wall, but most of the reconstructed lines are vertical; there are only a few horizontal lines, and it is impossible to discern what structure has been reconstructed. In contrast, Figure 12(f) restores a relatively large number of horizontal lines, making the line segments and the outlines of the bricks clearer.
Figure 13(c) shows a partial enlargement of the stepped portion; only a few lines have been reconstructed, while in Figure 13(d) the steps are reconstructed properly owing to the high reconstruction integrity of the proposed method. Figure 13(e) shows a partial area of the wall; clearly, Figure 13(f) contains more lines and gives a better result.
Table 1 shows the number of feature lines of Line3D++ and the proposed method for the two datasets. For the Castle dataset, the numbers of feature lines for Line3D++ and the proposed method are 1,590 and 1,676, respectively; for the Herz-Jesu dataset, they are 1,704 and 2,394, respectively. The number of feature lines increased significantly on both datasets. Table 2 shows the RMSE of Line3D++ and the proposed method on the two datasets; as can be seen, the proposed method achieves slightly higher accuracy.
Reconstruction of the 3D line segment model was conducted on two classical datasets. By comparing the experimental results and the data in Tables 1 and 2 before and after applying the proposed method, it can be seen that the proposed method effectively remedies the defects of insufficient accuracy and poor visual quality, such as excessive stray lines and areas where the building features cannot be restored. Moreover, the proposed method improves the matching accuracy, produces a more detailed model outline, and provides high 3D reconstruction efficiency.
5. Conclusion
This paper presents a line correction and matching method for the 3D reconstruction of target line structures and resolves the mismatching problem in the line matching step of the Line3D++ algorithm. The gradient map is extended to construct the gradient gravitational map, which corrects the positions of the straight-line segments produced by the line extraction method. The epipolar constraint is then used to eliminate mismatched straight lines and thus improve the quality of the 3D reconstruction. The experimental results demonstrate and validate that the 3D reconstruction results obtained by the proposed method are more accurate and complete than those of Line3D++.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This project was supported financially by the National Natural Science Foundation of China (no. 61601213), Department of Education of Liaoning Province (China) (no. LR2016045), and Liaoning Provincial Natural Science Foundation of China (no. 2019-ZD-0038).