The Navigation of Home Service Robot Based on Deep Learning and Machine Learning

Yan, Yupei; Ma, Weimin; Wong, Sengfat; Yin, Xuemei; Pan, Qiang; Liao, Zhiwen; Lin, Xiaoxin

doi:https://doi.org/10.1155/2024/5928227

Journal of Robotics

On this page

Abstract Introduction Conclusion Data Availability Conflicts of Interest Acknowledgments References Copyright Related Articles

Research Article | Open Access

Volume 2024 | Article ID 5928227 | https://doi.org/10.1155/2024/5928227

The Navigation of Home Service Robot Based on Deep Learning and Machine Learning

Yupei Yan,¹Weimin Ma,¹Sengfat Wong,²Xuemei Yin,¹Qiang Pan,¹Zhiwen Liao,¹and Xiaoxin Lin¹

Academic Editor: Changsheng Li

Received23 Aug 2022

Revised07 Oct 2023

Accepted20 Jan 2024

Published12 Feb 2024

Abstract

This paper discusses how to improve the accuracy of navigation for home service robot based on the deep learning and machine learning. First, the crawling programing is applied to collect enough images of fridge and washing machine on the web; a deep learning framework is proposed that can distinguish fridge and washing machine more accurately. Following, the data come from the robot operating system topics are collected and cleaned, the linear regression, decision tree, and linear SVR algorithms are applied and compared to predict the power consumption of the robot, and a conclusion is obtained that liner movement will consume more power, which provides a reference for the path planning of the robot. Lastly, the conclusions are proposed that a novel methodology is applied to distinguish different home appliances, which is useful for the accurate navigation of the robot; the liner movement will consume more power compared to turning left or right, which supplies a reference for the optimized path planning for the robot.

1. Introduction

Home service robot has been researched for many years, but there are still some problems for the application of them. One problem is the accurate location of the home service robot in the house, where is an indoor dynamic environment; because of the limited geomagnetic field signal in the indoor environment, the data from inertial measurement unit (IMU) sensors are drifting dramatically in the indoor environment, there is accumulated error of odometer data, which will be more obvious when robot moves for long time.

The map can be built to solve the location problem, but there are still two problems. The first problem is that it is a dynamic environment inside the house; people and animals are moving in the house, and when the robot needs to build the map, people need to stay in the house to control the movement of the robot, which becomes the interference factor for the map building. The second problem is that the position of home appliances and furniture will be changed by people sometimes, or some new home appliances and furniture are added, and old home appliances and furniture are thrown away, which needs the robot to build the map again frequently and brings the trouble for people.

Object recognition is a competitive option for the navigation of the robot in the house [1]; people just tell robot to move to the fridge to take some food or tell robot to take dirty clothes to the washing machine, and the robot can move to the position of the fridge or washing machine by recognizing the appearance of them in the house, which likes the human action in the real world.(i)One of the contributions of this paper is that it proposes a novel deep learning framework to distinguish the appearance of fridges and washing machines more accurately when they look similar or when they are partly shielded in the dark environment, which is useful for the robot to judge whether it moves to the correct and accurate position to finish some tasks, compared with other navigation algorithms in the indoor and dynamic environment, this methodology improves the accuracy for the navigation of the robot.(ii)Besides the accurate navigation of the home service robot in the indoor environment, how to save the battery power is also important; the power will be different when robot moves forward and backward or turns left and right. The data of movement are relative with the odometer data in the robot operating system (ROS), which can be read in the ROS topics; the original data contain many noise data values, abnormal values, missing values, and duplicate values, which need to be cleaned from the original data, after data cleaning, the data need to be normalized and uniformed to eliminate the unevenness of the numerical value in the data table. The classified data and predicted data should be assigned, and then the various machine learning algorithms are running to analyze which kind of movement will consume more power; it guides the robot to try to take less such movement and try other less power consumption movements in the navigation [2].(iii)The another contribution of this paper is that a conclusion is obtained that the liner movement will consume more power compared with the rotation, which comes from the result of the machine learning algorithms; it supplies a novel methodology to analyze power consumption; also, it supplies a reference that the robot should try to move less linearly when planning the least power path in the house where is an indoor and dynamic environment.(iv)Compared with the mathematical and analytical methodologies of literature, the methodologies in this paper are proposed based on the data analysis algorithms, such as deep learning and machine learning, which provide another novel idea to solve the problem of accurate navigation of the home service robot.(v)This paper proposes two items; one is accurate navigation of the home service robot in the indoor environment based on visual recognition of the home appliance through the novel-designed deep learning framework. The other is that the power consumption of the home service robot is predicted based on three machine learning algorithms, the corresponding data are read from the topics of ROS, and the data are cleaned and uniformed before running the machine learning algorithms, which provides a reference for the optimized path planning.

2. Location by Deep Learning

The home service robot usually works in a house where an indoor dynamic environment, with the home appliances and furniture removed, added, and changed position. The accurate navigation of the robot by laser scan sensor seems difficult; if robot can recognize the position of home appliances and furniture by camera, the service function will be intensified.

Tensorflow Object Detection library can be loaded in Ubuntu and transformed as the message format in ROS, which can recognize the home appliances and furniture in the house in ROS, but the confidence becomes lower when the light is dimmed or sheltered by people, animals, plant or other objects, or taking video from the top view of the object by the camera on the robot arm, or taking the video of a part of the object by camera, as shown in Figure 1.

Another problem is that it seems difficult to distinguish two similar objects when they are placed together in a dark environment, like fridge and washing machine are placed together in the house, as shown in Figure 2.

Usually, the other home appliances are easy to be recognized and distinguished by the Tensorflow Object Detection library or other models and libraries; the appearance of fridges and washing machines are rectangular columns, it is difficult to be recognized and distinguished in some special status, so the deep learning should be applied to help robot recognize the fridge and washing machine more accurately.

2.1. Web Crawler

In order to obtain enough pictures of fridges and washing machines to train the deep learning network, the web crawler programing is applied to obtain them on the internet. By input the keyword “fridge” or “washing machine” on the images item of the search engine to get the URL of them.

The request and re library are imported, the request.get () function requests the content of the URL by adding the header information, because the requested content is in image format, the content.decode () function is called to transform the format. The re.findall(“objURL”:“(.?)”,’, htlm_1) function calls regular expressions to search for images that match the conditions, “objURL” is the dict format data with the key-value result, the “.?” can traverse the entire web page code with less strings, the following is judging if the current result has been traversed already, if it is true, the current result will be dropped, the whole process will be circulated until the required quantity of images is obtained.

In this paper, the images of fridge and washing machine are crawled, as shown in Figures 3 and 4. A script is written to rename the crawled images as “object.number.jpg” format which is prepared for the following deep learning. The quantity of crawled images is 2,600, which can be divided into train set and test set. For washing machine, there is a total of 1,300 images, in which the train set is 1,000 images and for the test set is 300 images. For fridge, the train set is 1,000 images, and for the test set, it is 300 images as well, which can make sure the same recognition ability for fridge and washing machine.

2.2. Deep Learning Framework

There are lots of popular deep learning networks, such as LeNet5, Alex Net, Vgg16, etc. Because the purpose of the deep learning network in this paper is to distinguish the fridge and washing machine more accurately, which are two classifications and the respective characteristics are obvious, the deep learning network can be revised from LeNet5 and improved to be more suitable for the distinction task in this paper, which can be shown in Figure 5.

The difference from LeNet5 is following: the first is that it is an RGB color image but not a gray image, so it is three-channel image which is red, green, and blue; the second is that it is two classifications but not 10 classifications, the relationship of features is not so complex, so there is two dense layers but not three dense layers; the third is that the detailed information in training image is abundant, there needs more pixels to show the abundant information, so the input is normalized as 150 × 150, but not 28 × 28. After the improvement of the LeNet5 network, the proposed deep learning network will be more suitable for the distinction task in this paper.

As shown in Figure 6, in order to process the following data, the size of all images has been normalized as 150 × 150, the size of the first Conv2D is 3 × 3, the learning stride is 1, so the output is 148 × 148, the quantity of convolution kernel is 16, this layer scans the whole image for the next step processing.

Because it is a valid convolution, the output shape is 150 − 2 = 148, and there are total 16 convolution kernels whose size is 3 × 3; they are three-channel images; if adding the bias value, the quantity of parameters is (3 × 3 × 3 + 1) × 16 = 448.

The kernel of the next MaxPooling2D layer is 2 × 2, and the size is 74 × 74; it extracts the max value of the 2 × 2 square as the characteristic in this area. There is no parameter counted in the pooling layer.

The above procedure has been circulated three times to take the max value as the obvious characteristic in the image; the parameters can be counted as above paragraphs, the flatten layer is the 1D vector which can be applied to the next fully connected layer; finally, the output result should be true or false to judge if it is fridge or not, so the output of dense_1 layer is 1. The total parameters are the summation of parameters in each layer.

2.3. Training and Test

2.3.1. RMSprop

In order to solve the problem of too large swing amplitude in the update process in the loss function, as well as accelerating the convergence speed of the function, RMSProp algorithm uses the differential square weighted average for the gradient of weight W and bias B, in Formula (1)–(4), W represents the horizontal axis, b represents the vertical axis, dw represents the change in the direction of the horizontal axis, db represents the change in the direction of the vertical axis, S_dw is the result of the gradient squared weighted average when there is changes in the direction of the horizontal axis when learning, S_db is the corresponding gradient squared weighted average for vertical axis, α is the learning rate, β is similar to the attenuation factor in the momentum gradient descent method, representing the influence of the past gradient on the current gradient, and the general value is 0.9. In case the denominator is zero, a very small value is added on the denominator; usually, it is 10⁻⁸. and are the gradient momentum that is accumulated by the loss function during the first t − 1 rounds of iteration.

The RMSProp algorithm calculates the differential square weighted average of the gradients, which is beneficial to eliminate the direction of large swing amplitude, correcting the swing amplitude, and making the swing amplitude of each dimension is small. On the other hand, it also makes the network function converge faster.

2.3.2. Binary Cross Entropy

Binary cross entropy is a common loss function in binary classification problems.

The is the binary label 0 or 1, which means if the result is fridge and washing machine in this paper, p (y) is the probability, and the output belongs to the label y. As a loss function, binary cross-entropy is used to judge the quality of the prediction result of the model. Generally, for the case where the label y is 1 and the predicted value p (y) approaches 1, then the value of the loss function should approach 0; otherwise, if the predicted value p (y) approaches 0, the loss function will be very large [3].

2.3.3. Training Process

For three-channel image, the scope of integer value for RGB is 0–255, which is too large to affect the predicted result, ImageDataGenerator() function will transform the value to the float whose scope is 0–1 before the training process starts, which makes the image resistant to geometric transformation, for example, it can reduce interference caused by uneven light in images, etc.

The training process will take many GPU resources; the value of “allow_growth” should be “True” to control the GPU resource gradually.

The batch size is 20, the epoch is 15, and the target size is 150 × 150, because binary cross entropy loss is applied, the class_mode is “binary,” there are 100 steps in each epoch, the return times of validation generator are 50, the log is recorded for each epoch. After the above parameters are setup, the training process starts.

From Figure 7, it is obvious that the loss becomes lower and accuracy becomes higher in each epoch; the accuracy reaches 99.3% in the 15th epoch, which means that the model has a perfect distinction function for the training images, but the accuracy can not reach so high for the validating images.

From Figure 8, the loss near zero in the 15th epoch that shows the fitting is so good for the training images, but the validation loss is drifting in 15 epochs, it shows that the model is not so perfect for the test images, there are lots of reasons for this problem; usually, it is caused by the overfitting, while it also means the fitting is not suitable for the totally new test images.

The final result is the probability of fridge, which is called validation accuracy; if it is below 50%, it means the image is a washing machine; otherwise, it means the image is a fridge. In epoch 1, the validation accuracy is 50.7%; for the two classifications, it is nearly half of the full probability, which seems reasonable because it is the 1st epoch. In epoch 15, the validation accuracy is 83.5%, which means that the classification function has been improved a lot for the new images in the test set. Because there are some special images in the test set, as shown in Figure 1, the model can help robot to distinguish the fridge and washing machine when they are shielded or only part of them are observed.

3. Path Planning by Machine Learning

Usually, quintic polynomials are applied in the path planning for arm, mobile robot, and UAV; for the robot that is powered by battery, good power consumption management is not only for the cost, heat control, and charging time but also it can save the life of the battery and reduce the warranty times.

Usually, power consumption is the multiplied term of the voltage and current, which relates with the motion of wheels for the home service robot, and the motion of wheels relates with the movement distance and rotation angle, which are read from the odometer, so if the relationship between odometer data and power value can be analyzed, the optimized path planning will be clear.

In order to test the power consumption more accurately, only base_control topic is launched in ROS; other topics like robot_navigation, robot_vision are not launched, so devices like laser scan sensor and camera do not work in this situation.

3.1. Processing and Analysis of Power Consumption Data

In ROS, odometer data can be outputted from/odom topic, voltage, and currents value can be outputted from/battery topic; both of them can be recorded by inputting command “rostopic echo /odom > odom.txt,” “rostopic echo /battery > battery.txt” in Ubuntu.

In the test, in order to collect the odometer data and power value evenly in all directions, the tested mobile robot “nanorobot,” which is controlled by application software in the mobile phone to move randomly while avoiding collisions in each direction in the home, as shown in Figure 9, “nanorobot” is produced by WeiEmbedded company; the main components of “nanorobot” are laser scan sensor, binocular depth camera, the model number of NVIDIA GPU board is Jetson Xavier NX, the operating system (OS) is Ubuntu 16.04 LTS, the version of ROS is Kinetic.

From Figure 10, the existing key “secs” can be recorded as “second,” which is recorded since 0 : 00, January 1, 1970 in Linux system; “position: x, y, z” can be simplified as “px, py, pz”; “orientation: x, y, z, w” can be simplified as “ox, oy, oz, ow”; “linear: x, y, z” can be simplified as “lx, ly, lz”; “angular: x, y, z” can be simplified as “ax, ay, az,” as shown in Table 1. The /odom topic is published many times in one second, so the value of many rows belongs to one second; for example, the value with an index from 0 to 9 belongs to one second 1,691,289,812 in Table 1.

So, the values of each key in one second should be averaged, and taking “second” as the primary key, which can be expressed in Table 2.

From Figure 11, for the output data from/battery topic, the existing key “secs,” which can be recorded as “second”; other keys that need to be extracted are “voltage” and “currents,” as shown in Table 3. The battery topic is published only one time in each second; there is no need for the values to be averaged in one second and take the “second” as the primary key; the table can be expressed as Table 4.

(i)In Table 4, the power is equal to the voltage multiplied by currents, so the new key “power” can be added into Table 4. Comparing Tables 2 and 4, the key “second” is the primary key in both tables, so both tables can be merged as a whole table by the key “second,” which can be shown in Table 5.(ii)In Table 1, the key “px,” “py,” “pz” means the position of the robot in three dimensions coordinate axes, “ox,” “oy,” “oz,” “ow” means the quaternion value which represents the orientation of the robot in three dimensions coordinate axes, they represent the gesture of the robot in one moment, but they have no relationship with the power consumption of the robot, so they can be deleted from Table 1. the key “lx,”“ly,” “lz” means the linear velocity of the robot in three dimensions coordinate axes, the key “ax,” “ay,” “az” means the angular velocity of the robot in three dimensions coordinate axes, in which x, y, z is the rotation axis, because the home service robot moves on the horizontal surface, the value of linear z and angular x, y can be deleted from the whole table. Because the key “‘power” is in Table 5, the keys “voltage” and “current” can be deleted, so only keys “lx,” “ly,” “az,” and “power” are kept, as shown in Table 6.(iii)In order to measure the power consumption for the movement of the robot, the values of keys are recorded when robot keeps in both stopped and movement status.(iv)From Table 7, it is obvious that when robot keeps in stopped status and does not move again, there is still power consumption in each second, which comes from the main board of the robot. So, in order to obtain the power only for the movement of the robot, the value of key “power” when values of other keys are not zero should reduce the value of key “power” when values of other keys are zero, which can be expressed as the following formula: (v)The following is how to define the power value when the values of other keys are zero, and the first is computing the averaged power value when the values of other keys are zero, which represents computing the power value when the robot keeps stopped, then dropping the rows whose values of other keys are zero, which means dropping the values when the robot keeps stopped; finally, all the power values reduce this averaged power value, in this way, the balanced power values can represent the power consumption for the movement.(vi)The next is to eliminate the noise and abnormal data. The first is clearing the noise data; because of sensor error or systematic statistics error, some power value points will become much higher or lower in some status, so if the values of each key, such as value of “lx,” or “ly,” or “az,” or “power” at this second is three times larger than the corresponding values at the previous one second and the next one second, or 1/3 times smaller than the corresponding value at the previous one second and the next one second, this value can be considered as the noise value, whose row should be dropped. The second is clearing the abnormal data, and the universal method is the box plot algorithm, which can drop the data that is out of the normal range.(vii)Usually, when a car starts moving, the initial speed is low, but the fuel consumption is large; when the car keeps a stable, fast speed, the fuel consumption is not so large as it starts movement, because when the car starts moving, the accelerated speed is large, based on the Newton’s second law, the force is large, so the power is large, it can be concluded that power is in direct proportion to the accelerated speed, which can be regarded as the value of key “lx”, “ly”, “az” in this row reduce the corresponding values in the present row.

Because the values in the tables are all float type, it is not convenient for the processing of some algorithms, such as decision tree algorithm, etc.; all values should be transformed into integer format. Meanwhile, in order to eliminate the influence of big data in the tables, all values should be normalized between 0 and 100,000, the formula can be listed as follows:

As shown in Figure 12; when green and yellow lines become higher, the red lines become higher, when green and yellow lines become lower, the red lines become lower, but this phenomenon is not so obvious for the blue lines, so the value of linear x and linear y have more influence to power consumption than the value of angular z. Because the robot keeps stopped from the second 1,691,290,153–1,691,290,431 and after the second 1,691,291,512, the values of “lx”, “ly”, “az” are zero and have been dropped already in the previous steps, so there are only straight lines during those periods in Figure 12.

From the Pearson correlation degree matrix in Table 9, the same conclusion can be obtained, that the linear x and linear y have more influence on power consumption; it means that when robot moves linearly, such as move forward or backward, the power will be consumed more, when robot have angular velocity, such as turn left and right, the power will be consumed less.

Besides the rotation and straight line, the moving path can be a curved line; in Figure 13, the green rectangle is the mobile robot, the red ellipse is the obstacle that appears suddenly in front of the robot in the dynamic environment, the destination is the yellow rectangle. In order to avoid the collision with the red obstacle, there are three optional moving paths that can be chosen [4]; the first is the orange color arc line, the second is the blue line, the third is the green line, the blue line can be divided into five stages which are S1–S5, the green line can be divided into two stages which are d1 and d2.

If the blue line can be divided into lots of stages, the length of orange arc line L can be taken as the infinite serial method of summation for the length of the blue line in each stage, as shown in Formula (8). It is obvious that

So, for a small part of the orange arc line, it can be taken as that the mobile robot moves along a very short blue straight line while the moving angle is the black arrow line in Figure 13, which points to the tangent line of the orange arc line [5].

For the orange arc line, the robot should rotate the angle α1, which is the angle between the purple color dash line and Y axis first, then rotating along the black arrow line until it reaches the destination [6], the summation of the angle along black arrow line is α2 which is the angle between two orange dash lines, one is the vertical orange dash line which is parallel with Y axis, the other is the nearly horizontal orange dash line which is the extension line of orange arc line.

If the mobile robot moves along the green line, it should rotate the angle θ1 first, which is the angle between the purple color dash line and gray dash line, in which the purple dash line points to the rotation angle of the mobile robot when it is in the initial position [7], the gray dash line points to the rotation angle when it is in the second position after the rotation. Then, the mobile robot moves along the green line d1, following rotation, the angle θ2, which is the angle between the gray color dash line and green line d2, which points to the destination [8].

If the mobile robot moves along the orange arc line, the rotation angle is α1 + α2; if it moves along with the green line, the rotation angle is θ1 + θ2. From Figure 13, α1 > θ1, α2 > θ2, so α1 + α2 > θ1 + θ2, which means that the mobile robot will rotate at a larger angle if it moves along the orange arc line than it moves along the green line [9].

Combined with the moving distance and moving angle, it can be concluded that moving a green straight line will save more power than moving an orange arc line, so for the least power consumption path planning, the mobile robot should try to take rotation and move a straight line.

The following is the comparison of the blue line and the green line, assuming that the mobile robot only moves straight line but not moves arc line in the whole process; from the blue line, it is obvious that more rotation will lead to more distance from the summation of short straight lines.

Although rotation will consume less power than move straight line, but more rotations will lead to a longer path; also it will lead to a larger rotation angle because of the total summation of the rotation angle in each stage.

From the above analysis, the path of the orange arc line, more stages of the blue line, and the green straight line are considered comprehensively; in order to save the power of the battery, the path should be designed as the combination of least rotation angle and straight line movement as green line shows.

3.2. Machine Learning

If the power consumption is considered an influential factor for the path planning of the robot, the machine learning algorithm can be taken into consideration to predict the power consumption issue in this paper.

Before running the machine learning algorithm, the train data and test data should be split from the values of Table 8 with train_test_split() function [10]; there are the total 1,146 rows in the power consumption table, in which the train data set contains 1,046 rows data, the test data set contains 100 rows data.

Because the distribution of the predicted power value is during 1–1,00,000, it is difficult that the predicted power value is totally equal with the true power value. So, the error-tolerant rate is proposed to see if the predicted power value is in the range of proportion of true value. For example, predicted value ≤ true value 1.3 while predicted value ≥ true value 0.7. So it can be summarized as follows: predicted value ≤ true value “error-tolerant rate” while predicted value ≥ true value (2 − “error-tolerant rate”). The value of error-tolerant rate can be taken from 1.1 to 1.6.

3.2.1. Linear Support Vector Regression (Linear SVR) Algorithm

The support vector algorithm is applied to obtain a partition hyperplane that can split the two types of data more separately; the linear SVR algorithm is evolved from the support vector algorithm, which is applied to calculate the loss value when the predicted value and true value have some bias.

In Figure 14, the green line means the error-tolerant rate is 0.7, that the line for the predicted value equals with 0.7 × true value, the yellow line means the error-tolerant rate is 1.3, that the line for the predicted value equals with 1.3 × true value, the red line means the predicted value is totally equal to true value, the data which are distributed between green line and yellow line are the values that can be accepted, which means that the predicted result is correct when the data are distributed between green line and yellow line.

In Figure 14, the data are distributed more evenly and in a wider range; the variance is larger than in Figure 15, and more data are distributed out of green lines than in Figure 15, the reasons should be analyzed in the following work.

3.2.2. Linear Regression Algorithm

Linear regression algorithm is a universal machine learning algorithm that predicts the data by minimizing the mean square error.

From Figure 15, it is obvious that all the values are distributed nearly parallel with X axis; the reason should be analyzed in the following work.

3.2.3. Decision Tree Algorithm

The decision tree algorithm is a tree-like structure, each node inside the tree represents the test of a feature, the branch of the tree represents each test result of the feature, and each leaf node of the tree represents a category.

In Figure 16, some data are distributed in abnormal scope, whose predicted value is large while the test value is small; the reason should be analyzed in the following work.

3.2.4. Comparison of Three Different Machine Learning Algorithms

In Table 10, the primary key in the table is error-tolerant rate whose value is taken from 1.1 to 1.6, and the three different machine learning algorithms are compared, if the error-tolerant rate is 1.3, the accuracy of linear regression is 88%, which is the highest in three machine learning algorithms.

From Figures 14–16, it is obvious that the data are distributed more spread in the decision tree algorithm, the variance is larger, so the mean squared error is the largest in three algorithms. The data are distributed more concentrated in the linear regression algorithm, the variance is smaller, so the mean squared error is the smallest in three algorithms.

In Figure 15, the data are integrated into a fixed area that nearly parallel with X axis, the predicted values are trends to equal to the mean value, which is about 50,000, and the error of predicted values is distribution in a limited scope [11], which is more relative to the error of training data, so the r2 score of linear regression algorithm is near zero, but this cannot be explained in Figures 14 and 16, so the r2 score of the linear SVR algorithm and decision tree algorithm is smaller. The reason why all r2 scores of the three algorithms are negative should be analyzed in the following work.

In Formula (7), all values are normalized between 0 and 1, then magnified 100,000 times; it is almost impossible that the test value is equal to the predicted value, which can be reflected in the first row in Table 10, so the precision score, recall score, f1 score are all zero in Table 11.

The motion of wheels will consume power, the other devices on robot will consume power as well. In the test, there is an NVIDIA Jetson nano main board on the nanorobot, which will consume the power a lot and take more percent of power consumption, especially when the velocity of robot is little. Because the accuracy and deviation of the linear regression algorithm are acceptable compared with linear SVR and decision tree algorithm, the predicted result of power consumption of the linear regression algorithm is better.

4. Conclusion

In this paper, two issues are tried to be solved for home service robot; one is the distinction of fridge and washing machine in some special situations, such as they are shielded by people or animals, or the light is dimmed or just shows a part of them in the camera of robot. The other issue is that how to judge the power consumption, which can provide a reference for how to plan the path to save power for robot.

Deep learning is applied to solve the first issue, the test result shows that the training accuracy is nearly 100%, but validation accuracy only reaches 83%, so the validated result is not so perfect; the reason maybe overfitting, so how to prevent overfitting becomes the next step work, usually the regularization method can be applied to prevent overfitting and improve the generalization ability of the model, or the parameters can be adjusted appropriately so that the value can be more accurately adjusted during BP back propagation, or the sigmode activation function can be changed into ReLU activation function, or part of the fully connected layer can be changed into partial connected layer, or some pooling layers can also be added, deleted or modified, and the size of convolution kernel can be modified to improve the prediction accuracy of the model, etc.

For the second issue, there is a conclusion obtained that curve line movement will consume more power than straight line movement, and straight-line movement will consume more power than just rotation at one fixed position. The three different machine learning algorithms are applied to predict the power consumption; the test result shows that the predicted accuracy is not good; the reason comes from many factors; the first is the data from IMU sensors is not accurate in the indoor environment, the second is there are lots of gaps in the encoder which brings the error, the third is the friction on the ground will bring the error, the last is that the parameters of the machine learning models should be adjusted accurately to improve the accuracy of the prediction, which can be completed in the following work.

The conclusions obtained in this paper can be applied on the swarm of home service robots that communicate with each other by the distributed network in ROS [12]; for example, when one node which has powerful computation resources finishes the object recognition work with deep learning algorithm, it can send the corresponding results to another node which has less hardware resources with the distributed communication framework, which can realize the load balancing function in the distributed network.

Another application is the home robots for telemedicine/telehealth; for example, when elders need to measure the value of electrocardiosignal, blood pressure, and blood oxygen, the robot, which is equipped with physiological monitoring instruments can plan the path from its position to the position of elders with less power consumption by the proposed machine learning algorithms, then recognizing the fingers, wrist and arms of elders more accurately by the proposed deep learning algorithms.

The conclusions can be applied in other more application scenarios, which bring more development opportunities for the home service robot.

Data Availability

The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by The Research Project of Zhuhai City Polytechnic College in 2021 (KY2021Y01Z), Project of Qualified Teachers in College in 2021 (ZLJS2020120622, ZLJS2020120623, and ZLJS2020120624), Guangdong Province Science and Technology Innovation Strategy Special Funding in 2022 (pdjh2022b0992), Guangdong University Innovation Team Project (Natural Science) (Project No. : 2023KCXTD078), and Science and Technology Innovation Team of Zhuhai City Polytechnic (Project No. : KJCXTD202005).

References

S. Zhu, “Automatic recognition of facial expression based on computer vision,” International Journal on Smart Sensing and Intelligent Systems, vol. 8, no. 3, pp. 1464–1483, 2015.
View at: Google Scholar
H. Miao and X. Huang, “A heuristic field navigation approach for autonomous underwater vehicles,” Intelligent Automation & Soft Computing, vol. 20, no. 1, pp. 15–32, 2014.
View at: Publisher Site | Google Scholar
G. Ryou, Y. Sim, S. H. Yeon, and S. Seok, “Applying asynchronous deep classification networks and gaming reinforcement learning-based motion planners to mobile robots,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 6268–6275, Brisbane, QLD, Australia, 2018.
View at: Google Scholar
M. Radmanesh, M. Kumar, and P. H. Guentert, “Overview of path-planning and obstacle avoidance algorithms for UAVs: a comparative study,” Unmanned Systems, vol. 6, no. 12, pp. 95–118, 2018.
View at: Google Scholar
G. H. Xu, T. W. Zhang, and Q. Lai, “A new path planning method of mobile robot based on adaptive dynamic firefly algorithm,” Modern Physics Letters B, vol. 34, no. 29, pp. 159–167, 2020.
View at: Google Scholar
S. M. Sombolestan, A. Rasooli, and S. Khodaygan, “Optimal path-planning for mobile robots to find a hidden target in an unknown environment based on machine learning,” Journal of Ambient Intelligence and Humanized Computing, vol. 10, no. 5, pp. 1841–1850, 2019.
View at: Publisher Site | Google Scholar
Z. Jian, S. Zhang, S. Chen, Z. Nan, and N. Zheng, “A global–local coupling two-stage path planning method for mobile robots,” IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 5349–5356, 2021.
View at: Publisher Site | Google Scholar
M. Alireza, D. Vincent, and W. Tony, “Experimental study of path planning problem using EMCOA for a holonomic mobile robot,” Journal of Systems Engineering and Electronics, vol. 32, no. 6, pp. 1450–1462, 2021.
View at: Publisher Site | Google Scholar
R. Fareh, M. Baziyad, T. Rabie, and M. Bettayeb, “Enhancing path quality of real-time path planning algorithms for mobile robots: a sequential linear paths approach,” IEEE Access, vol. 8, no. 3, pp. 167090–167104, 2020.
View at: Publisher Site | Google Scholar
T. Li, D. Ho, C. Li, D. Zhu, C. Wang, and M. Q. H. Meng, “Houseexpo: a large-scale 2d indoor layout dataset for learning-based algorithms on mobile robots,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5839–5846, Las Vegas, NV, USA, 2020.
View at: Google Scholar
M. Chandrasekaran, M. Muralidhar, C. M. Krishna, and U. S. Dixit, “Application of soft computing techniques in machining performance prediction and optimization: a literature review,” The International Journal of Advanced Manufacturing Technology, vol. 46, no. 5–8, pp. 445–464, 2010.
View at: Publisher Site | Google Scholar
C. Tatino, N. Pappas, and D. Yuan, “Multi-robot association-path planning in millimeter-wave industrial scenarios,” IEEE Networking Letters, vol. 2, no. 4, pp. 190–194, 2020.
View at: Google Scholar

Copyright

Copyright © 2024 Yupei Yan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies