Abstract
This article presents a novel approach to compute and accelerate Occupancy Grid Mapping (OGM) for real-time mobile robot navigation. We improve the main processes involved in the construction of OGMs in two ways. On the one hand, we propose a new method to perform the camera calibration (CC) process, which reduces the number of intermediate steps (homographies) when the reference system is transformed from the real world into a navigation map, consequently reducing the computational cost. On the other hand, the OGM is constructed by an online probabilistic method that performs camera calibration and map construction simultaneously, thus reducing the cumulative errors that arise when these two processes are treated separately. Furthermore, the proposed system can be easily parallelized and mapped onto a digital system (e.g., Field Programmable Gate Arrays, FPGAs) for real-time robot navigation.
1. Introduction
Robot navigation is an important task that allows robots to perform activities such as localization, path planning, collision avoidance [1], and people finding [2]. Here, environment modeling plays a key role, and several methods have been proposed, e.g., Occupancy Grid Mapping (OGM) [3], Geometric Mapping (GM) [4], and Topological Mapping (TM) [5], with the first being the predominant one. The OGM method creates maps from environments that have been sensed by either a sonar [3, 6] or a different sensor such as a laser range finder (e.g., Laser Imaging Detection and Ranging, LIDAR) [7–11], ultrasonic [12], radar [13], and vision [14–18], as well as by combinations of them [19–24]. Thereby, any environment can be mapped using OGM-based methods, although estimating the localization of a robot, i.e., its position and orientation, in a high-resolution mapped environment may require high computational costs [25].
Traditional OGM-based methods have some issues, such as the incorrect assignment of occupancy probabilities in maps during sensor readings, especially when the sensor's beam is reflected over several surfaces and may or may not return. Another problem occurs during mapping when grid-elements' probabilities reach a probability of 1 and further sensor readings are not capable of changing them [26]. In this work, two major contributions to the main processes associated with OGM-based visual navigation systems, that is, Camera Calibration (CC) and Occupancy Map construction, are presented. In the former, we extract accurate metric information [27, 28] from the environment, which serves as input to generate the OGM and the navigation system. In the occupancy map construction, spatial representations of the robot environment [2] are created by covering the plane with repetitions of a particular regular polygon (a tessellation of the plane) [29], each of which stores a probabilistic estimation of its state (empty or occupied). To the best of our knowledge, these two processes are usually treated separately; thus, one main contribution of this work is that they are implemented and performed simultaneously to build up the proposed OGM-based visual navigation system. The other main contribution concerns the CC method. There are several CC methods and different classifications (see [30, 31] for comprehensive reviews), whose fundamental operation is the homography, usually represented as a matrix [32]. This is an invertible mapping from a projective plane to itself such that three points lie on the same line if and only if their mapped points are also collinear [33, 34]. Herein, the proposed CC method uses only two homographies, fewer than a traditional CC method (four in most cases), thus reducing the high computational costs and speeding up the process when the OGM is calculated.
In Figure 1, we graphically highlight the improvements of our approach to perform the CC process compared with a traditional one. In that figure, red and green arrows correspond to the homography operations required by our approach and by traditional ones, respectively. Here, the CC method allows associating points from the real-world's reference system with grid-elements of the OGM. The traditional CC method, green arrows labeled TM (Traditional Method) 1 to 4, works as follows: step TM1 calculates the homography transforming the real-world's reference system into the robot's reference system; step TM2 transforms the robot's reference system into the camera's reference system using a second homography; step TM3 transforms the camera's reference system into the camera's image plane (converting meters to pixels) by means of a third homography; finally, in step TM4, a fourth homography transforms the camera's image plane into the OGM (see [30] for implementation details). The proposed CC method, red arrows labeled PM (Proposed Method) 1 to 2, is as follows: step PM1 transforms the real-world's reference system into the robot's reference system by means of a homography, and step PM2, through a second homography, directly transforms the robot's reference system into the OGM; this is achieved by means of a particular experimental configuration of the calibration pattern (the chessboard-like pattern proposed in [27]) and the camera sensor, explained later.
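To make the reduction concrete, the following sketch contrasts the two pipelines with plain 3x3 matrix algebra. It is a minimal illustration, not the paper's implementation: the matrix names (`H_wr`, `H_rg`, etc.) and the sample point are placeholders, and identity matrices stand in for real calibration data.

```python
import numpy as np

# Placeholder 3x3 homographies for the traditional pipeline (TM1..TM4):
# world -> robot -> camera -> image plane -> OGM grid.
H_wr = np.eye(3)   # world to robot        (TM1)
H_rc = np.eye(3)   # robot to camera       (TM2)
H_ci = np.eye(3)   # camera to image plane (TM3)
H_ig = np.eye(3)   # image plane to OGM    (TM4)

def project(H, p):
    """Apply a homography H to a 2D point p (with homogeneous normalization)."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

# Traditional method: four successive transformations per point.
p_world = (1.0, 2.0)
p_grid_tm = project(H_ig, project(H_ci, project(H_rc, project(H_wr, p_world))))

# Proposed configuration: with camera and pattern prefixed, the last three
# maps collapse into one constant homography (robot -> OGM), precomputed
# offline, so only two transformations are applied online.
H_rg = H_ig @ H_ci @ H_rc
p_grid_pm = project(H_rg, project(H_wr, p_world))

assert np.allclose(p_grid_tm, p_grid_pm)  # same result, fewer online steps
```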

The proposed system aims to tackle some of the main problems associated with OGM-based visual navigation systems: CC methods are computationally intensive, and there is a trade-off between the size and the resolution of the grid, so that, to fulfill real-time requirements, it is common practice to use a simplified inverse sensor model that maps the measurements to the grid [24], which affects the precision of the representation [35]. In our proposal, the CC method and the OGM construction are treated and performed together, contrary to most works, which commonly process them separately. Besides, the grid-elements of the OGM are generated from the calibration pattern, which makes it possible to dispense with sensor model generation and reduces the cumulative errors derived from the rounded values of homographies. Moreover, the system was designed to be implemented on digital systems such as Field Programmable Gate Arrays (FPGAs) to take advantage of their capability to parallelize algorithms.
The rest of the paper is organized as follows: Section 2 presents related works. Section 3 introduces the proposed OGM-based visual navigation system, including its components and methods. Numerical experiments and their configuration are described in Section 4, and results are discussed in Section 5. Finally, Section 6 gives conclusions and future work.
2. Related Works
Maps are an interface for the visualization of environmental data that allows access to information and exploratory activities [36]. In this regard, OGM is a method for modeling both indoor and outdoor environments through a 2D or 3D grid, in which each grid-element stores quantitative information about the occupancy state (empty or occupied) of the environment's region it covers. The creation of an OGM is not a trivial task: it requires translating the data coming from sensors by means of sensory models to make estimations about the status of the environment being modeled [26]. The OGM method was introduced in [3]; it relies on sonar sensors and implements two intermediate models to map the probabilities of empty and occupied regions sensed from the environment. These models are unified according to an updating rule where the greater probability value for the same spatial grid-element between the two models is chosen to define the occupancy state (unknown, empty, or occupied) of the corresponding grid-element in the unified, or global, map. Later, in [37], a modification to update the global map was proposed, based on Bayes' theorem. However, these two updating rules have numerical issues when probabilities tend to the extreme values of 0 (empty) or 1 (occupied) [38]; this issue was overcome by Thrun's proposal, where the occupancy state of each grid-element is calculated as the logarithm of the ratio between the probability of being occupied and the probability of being empty (log-odds) [39].
Since their introduction, OGM methods have been the dominant paradigm to model environments; nevertheless, this is still a highly active research field. Several proposals to generate OGMs using different kinds of sensors have been published, although recently most research has relied on visual sensors such as CCD and RGB-D cameras. The reviewed works are described next.
LIDAR sensors have been used in several works for capturing environment information and modeling it through OGM. In [7], information is processed by assigning different probabilistic functions to each occupied grid-element. A Simultaneous Localization and Mapping (SLAM) task is carried out in [8] through a binary Bayesian filter as the updating function, together with mutual information to predict the uncertainty degree of each grid-element. In [9], a SLAM task is performed at the hardware level by using genetic algorithms and updating only the grid-elements along the sensor's exploration area. With the purpose of reducing hardware requirements, in [10] several sensors are integrated and only integer arithmetic is used to represent the probability of grid-elements. A novel sensor model to map and detect moving objects is proposed in [11]. In [24], LIDAR and radar sensors are used together with a Graphics Processing Unit (GPU) to generate maps, and a histogram-based approach is introduced for the detection of highway limits.
Biological vision systems are a powerful tool to explore and gather information from the environment. These systems have inspired the development of vision cameras such as CCD and RGB-D, which have been widely used for different applications, e.g., obstacle detection and region segmentation [40]. In [20], 3D probabilistic OGMs are generated for the autonomous navigation of aerial vehicles. A system for the automatic driving of cars on highways is proposed in [14]; the system processes 2D and 3D images and uses disparity maps to estimate the highway sections where driving is allowed. Problems with moving obstacles are studied in [16, 23]; these works used particle-based OGMs to detect the objects, where each particle has an associated position and velocity. In [17, 18], moving objects are detected through OGMs generated by means of stereo vision and a sigmoid-function-based updating rule. An automatic exploration method based on entropy change is proposed in [21] in order to avoid future uncertainty in the map. Finally, in [19], a SLAM task was carried out by means of structural data extracted from architectural blueprints for search and rescue applications.
Most of the aforementioned works implement Thrun's updating rule due to its versatility in dealing with a diversity of sensors. In particular, OGM-based visual navigation systems provide a large quantity of information about the environment when they perform a task, which is why we decided to design a specific updating rule for such systems. Also, in contrast to traditional OGM generation methods such as those proposed in [7–11, 14, 16–19, 21, 23, 24], our proposal is capable of detecting small obstacles. Although our approach presents advantages over traditional OGM-based methods, its main drawback is the long time it takes to process such an amount of data.
3. OGM-Based Visual Navigation System
Our proposal is schematized in Figure 2. It is mainly built up from three functional blocks, i.e., acquisition and image enhancement (green block), camera calibration (yellow block), and OGM construction (blue block).

3.1. Acquisition and Image Enhancement (AIE)
This block provides the input images to the CC and OGM construction blocks of the system. It is defined by two modules: the Camera Module, which captures images from the robot's environment, and the Gamma Correction Module, which enhances the images' contrast before they are processed by further blocks. Figure 3(a) shows a sample of an original image captured by the Camera Module and Figure 3(b) shows the same image after enhancement with the Gamma Correction Module. Next, each module is described.

3.1.1. Camera Module
In this work, we considered a highly portable EVB1005 CCD camera from Opal Kelly (Figure 4(a)) with the following technical specifications: an Aptina MT9P031 5-Mpixel CMOS image sensor, 14 fps at full resolution (2592 x 1944), 96 MHz pixel clock, and 12-bit ADC resolution. Also, this camera can be easily connected to an Opal Kelly ZEM4310, which is based on the Altera Cyclone IV FPGA (Figure 4(b)), for the deployment of real-time embedded robotic systems and high-level I/O communication.

Figure 4: (a) camera sensor module; (b) FPGA module.
3.1.2. Gamma Correction Module
In our approach, images were captured in a noncontrolled environment by a camera with a small shutter aperture; for this reason they present illumination variations. Thus, to enhance the colorimetric information of such images, a gamma correction factor is applied:

$$I' = 255\left(\frac{I}{255}\right)^{\gamma}$$

where $I$ denotes the image pixel intensity and $\gamma$ is the correction factor (see [41] for more details).
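As an illustration, a minimal gamma correction in Python; the gamma value used in the paper is not reproduced here, so the 0.6 below is only an example.

```python
import numpy as np

def gamma_correction(image, gamma):
    """Apply I' = 255 * (I / 255)**gamma to an 8-bit grayscale image.

    gamma < 1 brightens dark regions; gamma > 1 darkens them.
    """
    normalized = image.astype(np.float64) / 255.0
    corrected = 255.0 * normalized ** gamma
    return corrected.astype(np.uint8)

# Example: brighten an under-exposed frame (illustrative value of gamma).
frame = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
enhanced = gamma_correction(frame, gamma=0.6)
```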
3.2. Camera Calibration (CC)
The CC block carries out the namesake main process of OGM-based visual navigation systems. Here, camera calibration is performed to extract metric information [27, 28] and finally construct the grid required to build the OGM for mobile robot navigation. Figure 5 schematizes the chessboard-like pattern resulting from the CC process. The CC block assigns an index to label each pixel of an image according to the grid-element it belongs to and counts how many pixels lie in each grid-element. For the herein proposed CC method, the camera sensor and the calibration pattern (a black and white chessboard-like pattern [27] with squares of known size) are prefixed; the camera is set over the robot, while the calibration pattern is set on the floor in front of the camera. This configuration allows the camera's intrinsic parameters to be considered constants; thus only two homography operations are required to directly transform the real-world's reference system into the OGM, instead of the several intermediate homographies a traditional CC method generally requires (see Figure 1).

Despite the fact that the CC block has the highest computational cost in the whole proposed system, its calculations and data acquisition are performed only once. Besides, the proposal does not require knowing the region that each pixel occupies via geometric projection; instead, for each grid-element, only the number of pixels it contains and their spatial positions are required. Grid-elements closer to the camera have a higher definition compared with faraway ones.
The CC process is performed offline by the following modules: Harris Corner Detector, Definition of Grid-Element’s Components, and Grid-Element Pixel Counting.
3.2.1. Harris Corner Detector Module
It implements the Harris algorithm [42] to detect corners in the images captured and enhanced by the AIE block. This allows either adding missed corners or removing wrongly detected ones. The Harris algorithm returns the set of coordinates where corners were detected. In Figure 6, the corners detected in the calibration pattern are shown.
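A minimal corner-detection sketch using OpenCV's `cv2.cornerHarris` is shown below; the block size, Sobel aperture, `k`, and threshold are common defaults, not necessarily the paper's values, and the file name is hypothetical.

```python
import cv2
import numpy as np

def detect_corners(gray, threshold_ratio=0.01):
    """Return (x, y) coordinates of Harris corners; parameters are illustrative."""
    gray = np.float32(gray)
    response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
    ys, xs = np.where(response > threshold_ratio * response.max())
    return list(zip(xs, ys))

image = cv2.imread("calibration_pattern.png", cv2.IMREAD_GRAYSCALE)
corners = detect_corners(image)  # set of detected corner coordinates
```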

3.2.2. Definition of Grid-Element’s Components Module
In this module, images are segmented in order to define the grid-elements of the chessboard-like pattern. For this purpose, it is necessary to group the grid corners of the calibration image into rows and columns as shown in Figure 7, where the corners are represented by yellow squares and organized in horizontal lines (in blue) and vertical lines (in red). For each horizontal line, its first element is the left-most corner, whereas in each vertical imaginary line the uppermost corner is considered the first one. In the same manner, grid-elements can be defined by grouping rows and columns, taking the first element in the same way as for the imaginary lines. This module is performed through the next four steps; a sketch of the first step appears after this list.

(1) Horizontal Corners Grouping. The objective is to identify which detected corners are part of the same horizontal line. For this, the Euclidean distances between the first corner of a horizontal line and the subsequent corners are calculated, considering only the components on the y-axis; that is, a pair of corners belongs to the same horizontal line if its distance is less than a heuristic value $\varepsilon$ (see (2)):

$$\sqrt{(y_{c_1} - y_{c_k})^2} < \varepsilon \quad (2)$$

(2) Vertex's Grid-Element Estimation. To estimate the vertices that conform a grid-element, the first corners of two contiguous horizontal lines must be on the same vertical line. This is verified by calculating and comparing two Euclidean distances, $d_1$ and $d_2$ (see Figure 8); $d_1$ is calculated using the first corner of a horizontal line and the first corner of the next horizontal line, and $d_2$ is calculated using the second corner of the first line and that same first corner of the next line. The comparison in (3) determines which detected corner is the left side of a grid-element:

$$d_1 > d_2 \quad (3)$$

If (3) holds, the left side of the grid-element is in the same row as the corner (see Figure 8(a)); otherwise, it is in the same column as the corner (see Figure 8(b)). This is used to identify grid-elements through their conforming vertices.

(3) Line Segments Definition. By using the two-point form of the equation of a line, we define the segments that form each grid-element; that is, the slope and the intercept point are calculated.

(4) Grid-Elements Labeling. Finally, each pixel is assigned a label with the number of the grid-element that contains it. Labeling is performed by evaluating the line equations of the four sides of each calculated grid-element at the pixel's coordinates, which determines the pixel's position with respect to those sides and allows finding out whether the pixel is contained in any grid-element; if so, a label corresponding to the number of the grid-element is assigned to the pixel; otherwise, a zero value is assigned.
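A minimal sketch of step (1), assuming corners are given as (x, y) pixel tuples and that `epsilon` plays the role of the heuristic value in (2):

```python
def group_horizontal(corners, epsilon):
    """Group detected corners into horizontal lines.

    corners: list of (x, y) pixel coordinates from the Harris detector.
    epsilon: heuristic tolerance on the y-coordinate (the paper's exact
    value is not reproduced here).
    """
    lines = []
    for corner in sorted(corners, key=lambda c: (c[1], c[0])):
        for line in lines:
            # Same horizontal line if the y-distance to its first corner
            # is below the heuristic threshold (cf. step (1) above).
            if abs(corner[1] - line[0][1]) < epsilon:
                line.append(corner)
                break
        else:
            lines.append([corner])  # start a new horizontal line
    # Within each line, the left-most corner comes first.
    return [sorted(line, key=lambda c: c[0]) for line in lines]
```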


Figure 8: (a) the left side of the grid-element is in the same row as the corner; (b) the left side of the grid-element is in the same column as the corner.
The result of these four steps is shown in Figure 9, where the chessboard-like pattern appears with the segmented and labeled grid-elements marked in yellow.

3.2.3. Grid-Element Pixel Counting Module
This module has been designed to count the pixels of each grid-element over the whole grid, according to the last step of the Definition of Grid-Element's Components module. The resulting information is stored in a hash table and is later used, together with the Image Binarization module's output, so that the POLM module can estimate the occupancy probability of grid-elements. A detailed description is presented in the next section.
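A minimal sketch of the pixel-counting step, assuming the label image produced by the Grid-Elements Labeling step, with the hash table realized as a Python `Counter`:

```python
from collections import Counter

def count_pixels_per_grid_element(labels):
    """Build the hash table: grid-element label -> pixel count.

    labels: 2D array of per-pixel labels from the Grid-Elements Labeling
    step (0 means the pixel falls outside every grid-element).
    """
    counts = Counter()
    for row in labels:
        for label in row:
            if label != 0:          # ignore unlabeled pixels
                counts[label] += 1
    return counts
```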
3.3. Occupancy Grid Maps Construction
The global mapping corresponds to the exploration of the whole environment and, in turn, is built up from several Probabilistic Occupancy Local Maps (POLMs). In the beginning, an unknown environment has an OGM with a general occupancy probability equal to 0.5, which corresponds to an unknown occupancy state for all POLMs and their grid-elements, according to (4):

$$P(c_{i,j}) = 0.5 \quad \forall\, c_{i,j} \in \mathrm{OGM} \quad (4)$$

To sense an environment through a Probabilistic Occupancy Global Map (POGM), several POLMs must be generated; the whole process is as follows (see Figure 10). The first step is performed by the Image Binarization Module, which segments the images captured in the AIE block to distinguish between floor and obstacle regions. Then, due to the lack of knowledge about the spatial and orientation information of the current POLM and the general OGM, an odometry process is performed as a second step. The Probabilistic Occupancy Local Map module, in the third step, generates a partial map according to the orientation of the camera, which is used to define and update the OGM. The fourth and last step is the Update Module, which updates the OGM according to its previous occupancy states and the current occupancy states of its POLMs. These modules serve either to create an OGM of an unknown environment or to update one previously mapped.
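A minimal sketch of the initialization in (4); the map dimensions below are illustrative, not the paper's values.

```python
import numpy as np

def init_ogm(rows, cols, p_unknown=0.5):
    """Initialize an OGM with every grid-element in the unknown state.

    Following (4), all occupancy probabilities start at 0.5 until a POLM
    provides evidence.
    """
    return np.full((rows, cols), p_unknown)

ogm = init_ogm(100, 100)  # illustrative 100 x 100 grid
```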

3.3.1. Image Binarization Module
In this module, an image coming from the AIE block is segmented into two classes, corresponding to either floor (class 0) or obstacle regions (class 1). This is achieved by using a linear classifier trained with a gradient descent algorithm [43]. Here, each pixel is classified into one of the two classes according to the information of its neighborhood (a window); the mean and standard deviation of the window are used by the classifier as the feature vector.
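The following sketch illustrates the per-pixel classification, assuming a window of illustrative size and an already-trained weight vector `w`; the paper's trained weights and window size are not reproduced.

```python
import numpy as np

def window_features(gray, x, y, half=2):
    """Mean and standard deviation of the (2*half+1)^2 window around (x, y)."""
    patch = gray[max(y - half, 0):y + half + 1, max(x - half, 0):x + half + 1]
    return np.array([patch.mean(), patch.std(), 1.0])  # 1.0 is the bias term

def classify_pixel(gray, x, y, w):
    """Linear classifier: class 1 (obstacle) if w . features > 0, else 0 (floor).

    w: weight vector previously trained with gradient descent.
    """
    return int(w @ window_features(gray, x, y) > 0.0)
```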
3.3.2. Odometry Module
Robot navigation requires the position and orientation of the robot with respect to its local and global reference axes. This task is commonly achieved through odometry estimation, which is based on geometric equations that estimate the robot's location by combining information obtained from the wheels' encoders and actuators. This method is widely used because it only relies on the robot's kinematic model; the external forces acting on the mechanism are not taken into account. In our proposal, the Odometry module provides the robot's position and orientation when an image is captured or during navigation (see [38] for more details). This module provides input to the Probabilistic Occupancy Local Map and Update modules with respect to the real-world's reference system: the orientation is provided to the Probabilistic Occupancy Local Map module, whereas the spatial position is given to the Update module.
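For illustration, a standard differential-drive dead-reckoning update; this is the textbook kinematic model, used here as a stand-in since the paper's exact odometry equations are not reproduced.

```python
import math

def update_pose(x, y, theta, d_left, d_right, wheel_base):
    """Dead-reckoning pose update for a differential-drive robot.

    d_left, d_right: wheel displacements from the encoders since the last
    update; wheel_base: distance between the wheels.
    """
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / wheel_base
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta += d_theta
    return x, y, theta
```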
3.3.3. Probabilistic Occupancy Local Map Module
In this module, the Probabilistic Occupancy Local Maps (POLMs) are generated; each map has a fixed dimension in grid-elements, covering a corresponding surface in the real world. To generate a probabilistic occupancy local map at time $t$, we take the current image given by the Image Binarization module and the information of the grid generated in the CC block. For each grid-element in the binarized image, the pixels classified as obstacles are counted and used, together with the total quantity of pixels in the grid-element (obtained from the CC block), to calculate its occupancy probability $p_{i,j}$. A grid-element in a probabilistic occupancy local map can have any of three occupancy states (empty, unknown, or occupied), which are given by (5):

$$s_{i,j} = \begin{cases} \text{empty}, & p_{i,j} < 0.5 \\ \text{unknown}, & p_{i,j} = 0.5 \\ \text{occupied}, & p_{i,j} > 0.5 \end{cases} \quad (5)$$
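A minimal sketch of the per-grid-element probability estimate, assuming the obstacle-pixel count from the binarized image and the total count from the CC block's hash table; treating grid-elements outside the camera's field of view as unknown (0.5) is an assumption consistent with the text.

```python
def polm_probability(obstacle_pixels, total_pixels):
    """Occupancy probability of a grid-element in a POLM.

    obstacle_pixels: pixels of the grid-element classified as obstacle in
    the binarized image; total_pixels: pixel count from the CC block's
    hash table.
    """
    if total_pixels == 0:      # grid-element not visible in this image
        return 0.5             # unknown state
    return obstacle_pixels / total_pixels
```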
3.3.4. Update Module
This module updates the OGM by means of its current status and that of the POLMs. The update is carried out over each grid-element by comparing its current occupancy probability in the POLM against the corresponding occupancy probability in the OGM at the previous time. Next, we present the resulting cases of such a comparison for the current grid-element, and (6) shows the update rule for each of them.
Case 1. The grid-element is empty in the OGM and occupied in the POLM, or occupied in the OGM and empty in the POLM.
Case 2. The grid-element is unknown in the POLM, and its value is not taken into account:

$$P_t(c_{i,j}) = \begin{cases} w_1 P_{t-1}(c_{i,j}) + w_2\, p_{i,j}, & \text{Case 1} \\ P_{t-1}(c_{i,j}), & \text{Case 2} \end{cases} \quad (6)$$

where $w_1$ and $w_2$ are weight factors.
In Table 1, cases and update rules for the Probabilistic Occupancy Global Map are summarized.
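A minimal sketch of the update logic, under the assumption (stated above) that Case 1 combines the previous and POLM probabilities through the weight factors $w_1$ and $w_2$, while Case 2 leaves the previous value unchanged:

```python
def update_grid_element(p_prev, p_polm, w1, w2, unknown=0.5):
    """Sketch of the update rule (6), under the stated assumptions.

    Case 2: an unknown POLM value is ignored and the OGM keeps its
    previous probability. Otherwise (Case 1), the previous and POLM
    probabilities are combined with weight factors w1 and w2.
    """
    if p_polm == unknown:                # Case 2: no new evidence
        return p_prev
    return w1 * p_prev + w2 * p_polm     # Case 1: weighted update
```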
4. Experiments
To validate our approach, we carried out numerical tests in a controlled environment, i.e., a room with fixed obstacles. A sketch of the environment is shown in Figure 11, where blocks in gray represent obstacles and some samples of captured images at different spatial locations are shown; the arrows connected to the sample images mark the direction or orientation of the robot's sight in the environment. Furthermore, the floor of the environment is covered with tiles of known size, which were used as a reference for comparing the real position of obstacles in the environment against the position of obstacles detected by our method. For capturing the environment's images, the camera sensor was set over a robotic platform, the NI LabVIEW Robotics Starter Kit from National Instruments (NI); it has two DC motors, two encoders, and a reconfigurable SbRIO-9632 board (Single-Board Reconfigurable I/O). The utilized navigation method was potential fields; basically, it calculates the attraction forces produced by the target's position and the repulsion forces associated with the obstacles. Thus, the target's position attracts the robot while the obstacles repel it, allowing the robot to navigate among obstacles and reach its target position [38]; a sketch is given below.
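For illustration, a basic potential-fields force computation; the function name, gains, and influence radius are illustrative, not the values used on the robot.

```python
import numpy as np

def potential_field_force(robot, target, obstacles, k_att=1.0, k_rep=1.0, d0=1.0):
    """Resultant force on the robot under a basic potential-fields model.

    The target attracts the robot; obstacles closer than d0 repel it.
    """
    robot, target = np.asarray(robot, float), np.asarray(target, float)
    force = k_att * (target - robot)                     # attraction
    for obs in obstacles:
        diff = robot - np.asarray(obs, float)
        d = np.linalg.norm(diff)
        if 0.0 < d < d0:                                 # repulsion within d0
            force += k_rep * (1.0 / d - 1.0 / d0) * diff / d**3
    return force
```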

5. Results and Discussion
In Table 2, we present numerical results on the environment's metric information extracted by the camera calibration process using a chessboard-like calibration pattern (see Figure 5). These values correspond to the grid's information per grid-element, and they were further used to estimate the occupancy probability of maps.
Results for the Image Binarization module in the OGM construction process are shown in Figure 12, where the captured and resulting images appear in odd and even columns, respectively. In general, the module's output gives images with a good distinction between floor (black regions) and obstacles (white regions); for example, it can segment obstacles of significant size, such as the desktop's parts, but it is also capable of distinguishing small obstacles such as the power outlet.

Furthermore, in the even columns of Figure 13 we show the POLMs generated from the segmented images (odd columns). POLMs represent the visual field of the robot's camera as occupancy grids with three different colors, white, gray, and black, which represent high, unknown, and low occupancy probability, respectively. The unknown probability mainly refers to pixels either outside of the grid's region or behind an obstacle (outside the visual field of the camera). Thus, the POLMs shown in Figure 13 have grid-elements with different occupancy levels for the obstacles detected in the environment; for example, Figure 13(b) shows the POLM corresponding to Figure 13(a), with the left-side grid-elements empty (black), some grid-elements near and at the middle of the grid occupied (white), detecting the desktop's part, and the rest of the grid-elements, mostly on the right side, in an unknown occupancy state because they are out of the vision range (gray).

Figure 13: panels (a)–(p), segmented images (odd columns) and their corresponding POLMs (even columns).
Finally, in Figure 14 we present the output of our approach, where a precise Occupancy Grid Map has been constructed from the environment schematized in Figure 11. To get this, we ran the system over the experimental environment following the trajectory shown in yellow in Figure 14. Also, we show some probabilities calculated by the system when an obstacle was detected.

To evaluate the performance of our proposal, we compare the quality of the OGMs generated by our method against those obtained by Thrun's updating rule [39], both applied to the images shown in the previous section. Instead of performing a visual inspection of the generated OGMs, which would be highly subjective, we use OGM evaluation metrics, which provide quantitative values and allow a numeric comparison.
Several OGM evaluation metrics have been proposed in the state of the art (e.g., [19, 26, 44–46]). We use three of them, Map Score [26], Precision, and Recall [19], briefly described as follows; a sketch of all three appears after this list.

(i) The Map Score metric compares two maps M and N through the sum of squared differences between their grid-elements $m_{i,j}$ and $n_{i,j}$ at position $(i, j)$. Generally, the Map Score is used to compare an ideal map of the environment against a map generated by a method; thus, the lower the score is, the more alike the maps are:

$$\mathrm{score}(M, N) = \sum_{i,j} (m_{i,j} - n_{i,j})^2$$

(ii) The Precision and Recall metrics evaluate the quality of the generated maps through a precision-recall analysis. The generated maps are treated as classification problems in which the objective is to correctly classify the pixels into occupied or empty classes. The pixels occupied in the reference environment are considered true positives (TP) and the empty pixels true negatives (TN); the pixels of any other detected object are defined as false positives (FP), and obstacles missed by the method are false negatives (FN). The Precision metric represents the percentage of detected obstacles that are correct:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

and the Recall metric represents the percentage of obstacles correctly detected among all obstacles in the reference environment:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
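A minimal sketch of the three metrics for binary occupancy maps stored as NumPy arrays:

```python
import numpy as np

def map_score(m, n):
    """Sum of squared differences between two OGMs (lower = more alike)."""
    return float(np.sum((np.asarray(m, float) - np.asarray(n, float)) ** 2))

def precision_recall(generated, ideal):
    """Precision and recall of occupied cells against the ideal map.

    Both maps are binary arrays (1 = occupied, 0 = empty).
    """
    g, i = np.asarray(generated, bool), np.asarray(ideal, bool)
    tp = np.sum(g & i)       # obstacles correctly detected
    fp = np.sum(g & ~i)      # detections with no obstacle in the ideal map
    fn = np.sum(~g & i)      # obstacles missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```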
In order to use the aforementioned OGM evaluation metrics and compare our method against Thrun’s method, we obtained the ideal map and generated the OGMs by using our proposal and Thrun’s method; these are shown in Figures 15(a), 15(b), and 15(c), respectively.

Figure 15: (a) ideal map; (b) our method; (c) Thrun's method.
The results of the OGM evaluation metrics applied to the generated OGMs are shown in Table 3. It can be observed that our proposed method gets slightly better results for the three OGM evaluation metrics used for comparison. Even though the numeric values are close for both generated OGMs, our proposal is slightly more similar to the ideal map, since it has a lower value for the Map Score metric, and it detects obstacles more correctly according to the Precision and Recall metrics, since it obtains greater values than Thrun's method.
6. Conclusions and Future Work
In this work, we have contributed to the field of visual robot navigation by improving the main processes associated with the construction of navigation maps using occupancy grids. It is well known that OGM has been the predominant algorithm for navigation tasks in mobile robots such as localization, path planning, collision avoidance, and people finding; however, it is also true that this method demands high computational cost, especially for large map construction. Thus, we have proposed a novel method that uses fewer homography operations, which significantly reduces the amount of calculation compared to previous methods that require camera calibration; this is achieved by keeping the extrinsic camera parameters fixed. Furthermore, this improvement of the CC process allowed us to treat simultaneously the two main processes involved in OGM-based visual navigation systems, i.e., camera calibration and map construction. By treating these processes simultaneously, we get more precise navigation maps because the errors generated by rounding values while building maps are avoided. Three OGM metrics were used to compare our proposal against Thrun's method; both generated quite similar maps, but the maps generated by our proposal obtained values indicating a closer match to the ideal map. A final remark is that our method is able to detect small objects, as has been shown in the results.
On the other hand, OGM-based methods also depend on the correct positioning of the robot on the map; when this is based solely on odometry measurements, a large cumulative error can build up during continuous operation of the algorithm. We intend to solve this problem in future work using visual odometry to complement the measurements, either by analyzing the optical flow or by using markers, so that the displacement error can be estimated and corrected. Also, to improve the processing time, an FPGA-based architecture can be implemented to take advantage of the parallelism features of our method and thus accelerate the current implementation.
Data Availability
The images used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research has been supported by the CONACYT project FC2016-1961 "Neurociencia Computacional: de la teoría al desarrollo de sistemas neuromórficos"; the publication was financed with PFCE 2019 resources.