Abstract

Material perception is an essential ability for robots, and it poses significant challenges in real-world field applications. In harsh environments, existing material recognition methods suffer from low accuracy. In this paper, ultrasonic technology is applied to material recognition, inspired by the predatory strategy of echolocating animals. We propose a novel noncontact method, based on ultrasonic echo signals, with which robots can recognize the material of surrounding objects in extreme external environments. The method extracts a 16-dimensional feature vector from the intrinsic mode functions obtained by empirical mode decomposition and uses it as the input of machine learning algorithms to recognize different materials. K-nearest neighbor, decision tree, and support vector machine algorithms are applied to the feature vector set to determine the best classifier, and an acoustic theoretical model of the method is also established. The experimental results validate the accuracy of the acoustic theoretical model and the effectiveness of the proposed method. Compared with existing methods, the proposed method improves on the low accuracy and poor recognition performance of ultrasonic material recognition. This method provides a new idea for robots to recognize and perceive materials in extreme environments.

1. Introduction

Perception is a fundamental capability that allows robots to sense the surrounding world and their own states so that they can accomplish specific tasks and actions. In a complex environment, robots must carry out different missions, e.g., navigation, object grasping, and path planning, which require accurate information about the environment and about themselves in order to make decisions. Material recognition is a further development of robot technology for target perception. A method that lets robots perceive the material of surrounding objects is of great significance for enhancing their ability to understand environmental semantic information, which is critical for robots [1–3]. When a mobile robot is in an unfamiliar environment, a fixed motion mode brings great uncertainty to the robot’s autonomous action because of the variability of the ground. Accurate identification of the ground material provides environmental information to support the robot’s subsequent motion planning, which strengthens its autonomous capability. In addition, a robot that knows the material can grasp different objects more accurately and avoid damaging the target object with a fixed grasping force [4, 5].

In order to achieve more accurate perception, several material recognition methods are implemented. Many scholars have made gratifying achievements. Zhang et al. [6] introduced a novel approach to identifying materials by using reflectance, and they demonstrated the effectiveness of reflectance for material identification experimentally. Tanaka et al. [7] used the depth distortion of time-of-flight (ToF) measurements as features and achieved material classification. Sinapov et al. [8] proposed a bionic interactive robot surface recognition and surface classification approach based on vibration and haptic perception modalities. Jamali and Sammut [9] used machine learning algorithms to distinguish different materials based on the five main Fourier transform frequency values of each material with 95% accuracy.

Among these methods, visual recognition has the advantages of high accuracy, good repeatability, and strong versatility. Tactile-based methods are also commonly used for material recognition; they are easy to operate and have high recognition accuracy. Nevertheless, in extreme environments with complex and dangerous conditions that are dark, dusty, or filled with poisonous gases, robots cannot rely on these visual and tactile methods to acquire information about the external environment. Cameras are very sensitive to the light intensity in the environment and have strict requirements on the lighting conditions. When the lighting conditions are not ideal, especially in the dark, the specific information of the target object cannot be accurately obtained [10]. Laser beams are highly coherent and directional, do not easily bypass obstacles, and are easily disturbed by dust in the air during propagation. Therefore, lidars perform poorly in harsh environments such as dusty, snowy, and hazy ones [11]. Visual methods based on these common optical sensors, such as lidars and cameras, cannot obtain accurate external environment information because the sensors may not work properly [12, 13]. Tactile-based methods also have limitations: most of them require close contact with the object to be measured, and some require additional excitation. These constraints are significant drawbacks for practical robot applications.

However, in such severe scenarios, robots still need to acquire accurate environmental information to carry out their basic work. How to recognize materials and robustly differentiate them in harsh environments has therefore become increasingly important in recent years. In nature, bats and cetaceans not only detect and localize objects in their environment but are also able to classify shapes and textures based on the acoustic cues embedded in returning echoes. Bats use ultrasonic signals to perceive obstacles, recognize prey, and even track prey in extreme environments. Many cetaceans use the echoes of ultrasound produced by their vocal system to detect different prey or obstacles in harsh environments [14–16]. Inspired by these phenomena, ultrasonic sensors can be mounted on robots and applied to perception in extreme environments. Ultrasonic echoes are extremely useful for locating and identifying objects, and over the years scholars have mainly used the ultrasonic echo signals of surrounding objects to identify different materials. Politis and Probert Smith [17] modeled and analyzed random rough surfaces such as carpet, asphalt, tile, and wood floors and proposed a material surface texture classification method based on a geometric scattering model. Smith and Zografos [18] utilized the characteristics of ultrasonic waves reflected from surfaces to distinguish pathways. They demonstrated the method on a dataset acquired from a moving platform and achieved high accuracy in distinguishing different pathways. Bystrov et al. [19, 20] studied a pavement material identification method based on ultrasound and radar and used it to recognize pavement materials. González et al. [21] introduced a material identification method called Peniel and validated it experimentally using ultrasonic sensors.
Ultrasonic waves are not sensitive to color, illumination, or electromagnetic fields [22–24]. Additionally, a propagating ultrasonic wave has strong directionality, concentrates energy easily, and carries information about the transmission medium [25, 26]. Material recognition based on ultrasound has the merits of high precision, simple operation, and strong versatility, and can achieve accurate material identification [27].

Therefore, inspired by the predatory strategies of animals and the good properties of ultrasound, ultrasonic technology is applied to the task of material recognition for robots in this paper. A novel noncontact material recognition method of surrounding objects for robots based on ultrasonic echo signals is proposed, which can be used in external extreme environments. This approach mainly utilizes the 16-dimensional feature vector extracted from intrinsic mode functions (IMF) that we gain from empirical mode decomposition (EMD) as inputs of different machine learning algorithms (i.e. K-nearest neighbor (KNN), support vector machine (SVM), and decision tree) to recognize materials (i.e. wood board, paper board, and foam board), and its acoustic theoretical model is also built. The test experiment is carried out on a robot, Autolabor2, and the experimental results verify the accuracy of the acoustic theoretical model and the effectiveness of the proposed method. Compared with the existing methods, the proposed method improves the low accuracy and the poor recognition effect in ultrasonic material recognition. This method offers a new idea for robots to recognize and perceive materials in extreme environments.

The paper is organized as follows. Section 2 describes the overall procedure of the proposed method and briefly introduces its principles. Section 3 describes the experimental setup and gives a brief analysis of the experimental data. In Section 4, we validate the proposed method with different classification algorithms, determine the optimal algorithm through a comprehensive comparison, analyze the experimental results in combination with the acoustic theoretical model, and compare them with the related literature. Section 5 presents the conclusion of this paper.

2. Methods

2.1. The Overall Process of Robot Material Recognition Experiment

Figure 1 shows typical extreme environments with complex and dangerous conditions, which are dark, dusty, or filled with poisonous gases. In such environments, robots may collect incorrect information about the external environment and consequently cannot accurately identify their surroundings [12, 13]. Cameras are very sensitive to the light intensity in the environment and have strict requirements on the lighting conditions. When the lighting conditions are not ideal, especially in the dark, the specific information of the target object cannot be accurately obtained [10]. Laser beams do not disperse easily over long distances, cannot easily bypass obstacles, and are susceptible to dust during propagation. Therefore, lidars perform poorly in harsh environments such as dusty, snowy, and hazy ones [11]. However, in such extreme scenarios, robots still need to acquire accurate environmental information to carry out their normal work.

Therefore, how to recognize materials and robustly differentiate them in harsh environments becomes more and more important. In nature, bats are capable hunters that hunt with remarkable speed and precision. Bats use ultrasonic signals to perceive obstacles, recognize prey, and even track prey in extreme environments. They can use pulsed ultrasound to accurately distinguish different obstacles, companions, and prey. Most bats typically inspect targets with many pulses and may thereby collect sufficient information to assemble a representation of a target’s general physical features. Many cetaceans use the echoes of ultrasound produced by their vocal system to detect different prey or obstacles in harsh environments [14–16]. Inspired by these phenomena, ultrasonic sensors can be mounted on robots and applied to material perception in extreme environments. Ultrasonic waves are not sensitive to color, illuminance, or electromagnetic fields, so they can be used in harsh environments characterized by darkness, dust or smoke, strong electromagnetic interference, and toxicity. A propagating ultrasonic wave has strong directionality, concentrates energy easily, and readily carries information about the transmission medium [22–26].

We propose a noncontact material recognition method of surrounding objects for robots based on ultrasonic echo signals, which can be used in extreme external environments, and the method is validated experimentally on the robot, Autolabor2. The process is demonstrated in Figure 2. The material recognition process mainly contains the following steps. First, the industrial personal computer (IPC) sends an instruction to the robot so that Autolabor2 stops when it reaches the target position. Second, the STM32 high-frequency ultrasonic board excites the transducer (T) to transmit a broadband pulse signal. Third, the transducer (R) receives the echo signal, which is collected by the IPC. Finally, the proposed method is applied to recognize the material of the surrounding object, and the result is returned to Autolabor2 for further processing.

2.2. The Proposed Method about Material Identification

The specific steps of the proposed method are presented in Figure 2. First, we gather a large amount of data and apply the EMD method to it to reduce the noise. We analyze the time domain, the frequency domain, and the power spectrum of the IMF1 components obtained after EMD processing. Then, we compute the 16 selected feature parameters of the IMF1 components to form a 16-dimensional feature vector. Repeating the same procedure for the different materials yields the feature vector dataset to be classified. Finally, we apply different machine learning algorithms to this feature vector dataset to determine the best classifier and comprehensively analyze the results in combination with the acoustic theoretical model.

In this experiment, the transducers are fixed on Autolabor2 parallel to the ground. We control Autolabor2 to stop directly in front of the measured object so that the ultrasonic waves strike the object at normal incidence. A simple schematic diagram of the principle is displayed in Figure 3. The main experimental setup consists of a transmitter (T) that emits ultrasonic pulses and a receiver (R). The device is constructed in analogy to a bat, with (T) and (R) being equivalent to the bat’s mouth and ears. The literature shows that bats can use ultrasonic pulses to discriminate accurately between different prey. Therefore, we use this device to mimic the pulsed ultrasound signal of bats, and the transducer (T) emits short ultrasonic pulses with a frequency of 200 kHz.

Furthermore, we model and analyze the proposed method through acoustic theory. When ultrasonic waves are normally incident from one medium onto another, the distribution of sound energy (i.e., sound pressure and sound intensity) and the change of propagation direction follow certain laws [25, 28–30]. At the interface between different materials, part of the ultrasonic wave is reflected back into the original medium and is called the reflected wave, and the other part propagates through the interface into the other medium and is called the transmitted wave. The schematic diagram of the acoustic theoretical model is shown in Figure 4.

Following the analysis of the acoustic theoretical model, we can derive from the wave equation the relationship between the sound pressure reflectivity and the acoustic impedance at normal incidence:

$$r = \frac{P_r}{P_i} = \frac{Z_2 - Z_1}{Z_2 + Z_1} \tag{1}$$

In Equation (1), $r$ denotes the sound pressure reflectivity, $P_r$ represents the reflected sound pressure, $P_i$ represents the incident sound pressure, $Z_1$ represents the acoustic impedance of air, and $Z_2$ represents the acoustic impedance of the measured material.

From Equation (1), we can observe that the reflected sound pressure at the interface varies greatly between materials. The magnitude of the reflected sound pressure is closely correlated with the acoustic impedance. Moreover, the acoustic impedance is closely related to the properties of the material, and its expression is shown in Equation (2):

$$Z = \rho c \tag{2}$$

The acoustic impedance is determined by two parameters: the propagation speed of sound waves in the material (i.e., $c$) and the density of the material (i.e., $\rho$). Through this acoustic theoretical model, we can conclude that the ultrasonic echo signal can be used for differentiating materials.
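
As a hedged numerical illustration of the impedance and reflectivity relations, the following minimal Python sketch evaluates the impedance as density times sound speed and the normal-incidence pressure reflectivity at an air-to-material interface. The function names are hypothetical, the air properties are assumed room-temperature values, and the two material entries are invented placeholders, not the values reported in Table 3.

```python
def acoustic_impedance(density_kg_m3, sound_speed_m_s):
    """Characteristic acoustic impedance Z = rho * c, in Pa*s/m (rayl)."""
    return density_kg_m3 * sound_speed_m_s


def pressure_reflectivity(z_incident, z_target):
    """Sound pressure reflectivity at normal incidence: (Z2 - Z1) / (Z2 + Z1)."""
    return (z_target - z_incident) / (z_target + z_incident)


# Assumed properties of air near room temperature (not taken from the paper).
Z_AIR = acoustic_impedance(1.2, 343.0)

# Hypothetical (density, sound speed) pairs, NOT the paper's Table 3 values.
EXAMPLE_MATERIALS = {
    "dense board (hypothetical)": (600.0, 3000.0),
    "light foam (hypothetical)": (30.0, 500.0),
}

for name, (rho, c) in EXAMPLE_MATERIALS.items():
    z = acoustic_impedance(rho, c)
    r = pressure_reflectivity(Z_AIR, z)
    print(f"{name}: Z = {z:.0f} rayl, r = {r:.4f}")
```

Because the impedance of air is small compared with that of most solids, the reflectivity is close to 1 for many materials, so differences between materials appear as small changes in the echo. This is one way to read the later need for noise reduction before feature extraction.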

Owing to the noise in the ultrasonic echo signal, it is difficult to apply this signal directly for accurate material identification. Hence, we need to improve the signal-to-noise ratio (SNR) of the ultrasonic echo signals. The advantage of the EMD method is that it decomposes the signal according to the time-scale characteristics of the data itself. Compared with other methods, EMD does not need preset basis functions and is well suited to nonstationary and nonlinear data. The IMF components generated by the EMD method have a high SNR, and the noise is greatly reduced [31, 32]. As a result, we can select IMF components obtained by EMD processing as the data source and compute their feature values to identify the material.
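
To make the sifting idea behind EMD concrete, the sketch below extracts a first IMF from a synthetic two-tone signal using only the Python standard library. It is a deliberate simplification and not the implementation used in this work: the envelopes are piecewise linear rather than cubic splines, the number of sifting passes is fixed instead of governed by a stopping criterion, and only the first IMF is extracted. All function names are hypothetical.

```python
import math


def local_extrema(x):
    """Return the indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices (flat at both ends)."""
    n = len(x)
    env = [0.0] * n
    for i in range(knots[0] + 1):
        env[i] = x[knots[0]]
    for i in range(knots[-1], n):
        env[i] = x[knots[-1]]
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = x[a] + t * (x[b] - x[a])
    return env


def sift_first_imf(signal, n_sifts=10):
    """Simplified sifting: repeatedly subtract the mean of the two envelopes."""
    h = list(signal)
    for _ in range(n_sifts):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema to build envelopes
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    residual = [s - v for s, v in zip(signal, h)]
    return h, residual


# Synthetic test signal (an assumption for illustration): fast tone plus slow tone.
times = [i / 1000 for i in range(1000)]
signal = [math.sin(2 * math.pi * 50 * t) + math.sin(2 * math.pi * 5 * t) for t in times]
imf1, residual = sift_first_imf(signal)
```

In this toy case the first IMF retains the fast oscillation and the residual retains the slow one, which mirrors the observation that IMF1 carries the highest frequency range of the echo signal.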

From the acoustic theoretical model, we can conclude that the ultrasonic echo signals of different materials have distinct parameters in different spectral domains. Statistical analysis is one of the simple and classical time-domain analysis methods. Commonly used time-domain statistical parameters include the peak value, arithmetic mean, variance, and root mean square, and these parameters can be chosen as feature parameters for identifying various materials [33]. The reflected acoustic pressure of the ultrasonic echo signal is also related to energy, frequency, and other factors. In addition to the differences in the time domain, feature parameters extracted from the frequency domain and power spectrum can therefore contribute significantly to distinguishing the echo signals of different materials. In summary, we select 16 feature parameters from the time domain, frequency domain, and power spectrum and assemble them into a feature vector. The equations of the selected feature parameters are shown in Table 1.
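
As a small illustration of the time-domain statistics named above, the following sketch computes the peak value, arithmetic mean, variance, and root mean square of a sampled signal. It covers only these four parameters; the full set of 16 features is defined in Table 1 and is not reproduced here. The function name is hypothetical, and the use of the population variance (division by the number of samples) is an assumption.

```python
import math


def time_domain_features(samples):
    """Return four common time-domain statistics of a sampled signal."""
    n = len(samples)
    mean = sum(samples) / n
    # Population variance (divide by n); Table 1 may use a different convention.
    variance = sum((v - mean) ** 2 for v in samples) / n
    rms = math.sqrt(sum(v * v for v in samples) / n)
    peak = max(abs(v) for v in samples)
    return {"peak": peak, "mean": mean, "variance": variance, "rms": rms}


# Toy input: an alternating unit-amplitude sequence.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features)
```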

2.3. Ultrasonic Echo Signal Classification

The algorithm in machine learning can map the data of an unknown category to one of the definite categories to reach the purpose of accurate identification. Common classification algorithms include SVM, KNN, decision trees, artificial neural networks, etc. [34, 35]. In this paper, we utilize SVM, KNN, and decision tree to realize the goal of material identification. We compare and analyze the three algorithms to resolve the best classifier for the proposed method by means of experiment.

SVM is one of the known and most widely used supervised machine learning algorithms, which is mainly used for data classification and regression. The main simple idea of the SVM is to find the best hyperplane that is able to separate the training dataset into different classes. During the training of SVM, the classification algorithm projects the input data into a high-dimensional feature space through a specific nonlinear transformation, and an optimal hyperplane is constructed to separate the different classes with the maximum distance in the high-dimensional space. The SVM algorithm is a small-sample learning algorithm with a solid theoretical foundation. It is simple and has good working performance on ultrasound data [34, 36].

KNN is one of the most basic and simple algorithms in machine learning. Its core concept is that if most of the k nearest neighbors of a test sample in the feature space belong to a certain class, the test sample is also assigned to this class and is taken to share the characteristics of this class. KNN is a simple, nonparametric, instance-based learning algorithm, and it is also a lazy learning algorithm that depends on statistics. This method is suitable for datasets with statistical characteristics, and it has achieved certain results in different fields [37, 38].
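
The majority-vote rule described above can be sketched in a few lines of Python. This is a minimal illustration on invented two-dimensional points, not the classifier configuration used in the experiments; the function name, the toy data, and the class labels are hypothetical. The three distance options correspond to the Euclidean, Manhattan, and Chebyshev metrics.

```python
import math
from collections import Counter

# Distance functions between two equal-length feature vectors.
METRICS = {
    "euclidean": lambda a, b: math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b))),
    "manhattan": lambda a, b: sum(abs(p - q) for p, q in zip(a, b)),
    "chebyshev": lambda a, b: max(abs(p - q) for p, q in zip(a, b)),
}


def knn_predict(train_x, train_y, query, k=3, metric="euclidean"):
    """Label a query point by majority vote among its k nearest training points."""
    distance = METRICS[metric]
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy data: two well-separated clusters.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]
print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))
```

Since all computation happens at query time, no training step is needed, which is what "lazy learning" refers to.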

The decision tree algorithm has a strong learning ability and is widely available in the fields of classification, prediction, and feature evaluation. It is one of the classic algorithms in machine learning. The decision tree algorithm’s final classification results are acquired by specific rules. The algorithm divides the sample data according to the features and recursively generates a tree-structured decision tree, which can be used to predict and classify the data set. A decision tree classifier is constructed by a set of inner nodes and leaf nodes, which denote decision thresholds and predictions, respectively. The decision tree algorithm has the characteristics of fast calculation speed and high accuracy, is suitable for high-dimensional data, and does not need to preassume parameters. This algorithm is suitable for the dataset that we obtained [34, 38, 39].
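
Decision trees choose each split by an impurity criterion; the grid search in Section 4 considers information gain and the Gini coefficient. The following sketch shows how both quantities can be computed for a candidate split. The function names and the toy labels are hypothetical, and the code illustrates the criteria only, not a full tree-building procedure.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity decrease obtained by splitting parent into left and right."""
    n = len(parent)
    weighted = len(left) / n * impurity(left) + len(right) / n * impurity(right)
    return impurity(parent) - weighted


# Toy labels: a perfectly separating split of a balanced two-class node.
parent = ["a", "a", "b", "b"]
print(split_gain(parent, ["a", "a"], ["b", "b"], entropy))
print(split_gain(parent, ["a", "a"], ["b", "b"], gini))
```

With `entropy` as the impurity, `split_gain` is the information gain; with `gini`, it is the decrease in Gini impurity.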

2.4. Evaluation Metrics

The confusion matrix can be used to tabulate the real sample values and the values predicted from the algorithm, which could express the categories and number of correct and incorrect results clearly. The following metrics are commonly used to evaluate the algorithms: (TP = true positive, FP = false positive, FN = false negative, TN = true negative).

Accuracy indicates the proportion of correctly classified samples among all samples. Higher accuracy means more accurate predictions [34, 36]. The calculation formula is shown in Equation (3):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

Precision indicates the ratio of correctly classified positive examples to the total number of predicted positive examples [36]. The calculation formula is shown in Equation (4):

$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$

Recall indicates the proportion of correctly classified positive examples to the total number of actual positive examples [34]. The calculation formula is shown in Equation (5):

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

We expect both precision and recall to be high; however, these two metrics tend to trade off against each other. For this reason, the F1-score, a comprehensive index that combines precision and recall, is adopted as a compromise [36]. The calculation formula is shown in Equation (6):

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$
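
For a three-class problem such as the one studied here, these metrics are usually computed per class from the confusion matrix by treating each class in turn as the positive class. The sketch below shows one way to do this; the function name is hypothetical and the counts are invented for illustration, not taken from Figure 10. Rows are assumed to be true classes and columns predicted classes.

```python
def per_class_metrics(confusion, labels):
    """Accuracy plus per-class precision, recall, and F1 from a confusion matrix."""
    size = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(size))
    report = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fp = sum(confusion[r][i] for r in range(size)) - tp  # predicted i, truly other
        fn = sum(confusion[i]) - tp                          # truly i, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Invented counts (rows: true class, columns: predicted class).
labels = ["wood", "paper", "foam"]
confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
report = per_class_metrics(confusion, labels)
print(report)
```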

3. Experiment

To verify the feasibility and accuracy of the proposed method, we conduct material recognition experiments on the robot, Autolabor2. The main devices of the experiment are shown in Figure 5, including a transducer (T) that transmits ultrasonic pulse signals, a transducer (R) that receives signals, the STM32 high-frequency ultrasonic board that drives the transducer (T), an industrial personal computer, and Autolabor2. Figure 6 shows the target materials that need to be discriminated. In this experiment, paper board, foam board, and wood board are chosen as the target materials because they frequently appear in extreme environments, e.g., construction, field, and rescue environments. These three materials are also flammable and therefore pose a certain danger, so it is important to identify them accurately in extreme environments. The boards of the three materials all measure 50 cm × 50 cm with a thickness of 1 mm.

The ultrasonic transducer (T) is a standard airborne ultrasonic transducer with an operating frequency of 200 ± 10 kHz and an operating blind zone of 25 cm. To stay outside the blind zone of the transducer and obtain a more accurate and effective ultrasonic echo signal, Autolabor2 is parked directly in front of the measured material at a distance of 35 cm. After the robot arrives at the designated position, the STM32 high-frequency ultrasonic board drives the transducer (T) to repeatedly emit a 200 kHz broadband pulse signal with a duration of 0.15 ms. The transducer (R) receives the ultrasonic echo signal, and the industrial personal computer acquires it with 1,400 sampling points at a sampling frequency of 500 kHz. We collect the same amount of data for each material to avoid an imbalanced dataset. After the dataset is generated, different classification algorithms are applied to it, and the results are analyzed.
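
The acquisition parameters above imply a few derived quantities that are useful as a sanity check. The sketch below computes them under two assumptions that are not stated in the text: a speed of sound of 343 m/s in air, and an acquisition window that is compared against the round-trip time measured from pulse emission. The function name is hypothetical.

```python
def echo_timing(distance_m, n_samples, sample_rate_hz, pulse_s, carrier_hz,
                speed_m_s=343.0):
    """Derive timing quantities from the acquisition settings.

    speed_m_s is an assumed speed of sound in air near 20 degrees C.
    """
    return {
        "window_ms": 1e3 * n_samples / sample_rate_hz,        # record length
        "round_trip_ms": 1e3 * 2.0 * distance_m / speed_m_s,  # out and back
        "cycles_per_pulse": pulse_s * carrier_hz,             # carrier cycles emitted
        "samples_per_cycle": sample_rate_hz / carrier_hz,     # sampling density
    }


# Settings from the experiment: 35 cm, 1,400 points, 500 kHz, 0.15 ms, 200 kHz.
timing = echo_timing(0.35, 1400, 500e3, 0.15e-3, 200e3)
for key, value in timing.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions the record lasts 2.8 ms, the echo from 35 cm returns after roughly 2.04 ms, each pulse contains about 30 carrier cycles, and the carrier is sampled 2.5 times per cycle, which satisfies the Nyquist criterion for a 200 kHz signal.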

Figure 7 shows the time-domain waveforms of the ultrasonic echo signals of different materials. From this figure, we can observe that the time-domain waveforms of different materials differ little and that the differences in the characteristic parameters are small. When the frequency domain and power spectrum of the ultrasonic echo signal are further compared and analyzed, we find that the feature values in these spectral domains are not distinctive either. It is thus difficult to identify the material accurately using the characteristic parameters of the raw ultrasonic echo signal. The signal contains very weak useful information and has a low SNR; therefore, we adopt the EMD method to reduce the noise in the ultrasonic echo signal. After EMD processing, we obtain IMF components of different orders as well as the residual component. Taking the ultrasonic echo signal of the wood board as an example, the time-domain waveforms of the IMF components of different orders and of the residual component are shown in Figure 8. The time-domain waveforms of the IMF components are very similar to the ultrasonic echo signals. In addition, the main frequency range varies greatly between IMF components of different orders: IMF1 has the highest main frequency range, and the main frequency ranges of the remaining IMF components decrease in sequence. We analyze the power spectrum, frequency domain, and time domain of the IMF components of different orders and find that the central frequency range of the IMF1 components is 150–200 kHz, which is also the frequency range that needs to be focused on. Consequently, we calculate the characteristic parameters of the IMF1 components from the time domain, the frequency domain, and the power spectrum, and a 16-dimensional feature vector is generated from them.

After collecting a large amount of data on the wood board, the paper board, and the foam board, we apply the proposed method to this dataset and obtain the final feature vector set, which serves as the input of the different algorithms. The parallel coordinate diagram in Figure 9 shows the values of the final feature vector set.

4. Results and Discussion

After obtaining the feature vector set, we input it into the different classification algorithms (i.e., SVM, KNN, and decision tree) for a comprehensive comparative analysis to choose the best classifier. To make the material recognition results more accurate and reliable, we employ the 10-fold cross-validation (CV) procedure to randomly divide the dataset into training subsets and validation subsets. The training subset is used to find the appropriate parameters of the algorithm, and the validation subset is used to evaluate the performance of the algorithm [40]. We then normalize the data so that all features share the same value range, which speeds up gradient descent in finding the optimal solution. To determine the optimal hyperparameters of each algorithm, we run a grid search over some hyperparameters [41]. For the SVM algorithm, the kernel function is either Gaussian or linear. With the Gaussian kernel, the candidate values of the penalty coefficient C of the objective function are [0.001, 0.1, 1, 10, 100], and the candidate values of the kernel coefficient sigma are [0.1, 1, 4.1, 16]. For the KNN algorithm, the distance metrics are Euclidean distance, Chebyshev distance, and Manhattan distance, and the number of neighbors K ranges from 1 to 60. For the decision tree algorithm, the splitting criteria are information gain and the Gini coefficient, and the maximum number of splits takes values from 1 to 30. The optimal hyperparameters of the different algorithms are shown in Table 2.
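
The cross-validation and grid-search bookkeeping described above can be sketched as follows. This is a minimal illustration of how 10 folds and the Gaussian-kernel candidate grid might be enumerated, not the tooling used in this work. The function names and the sample count of 37 are hypothetical, and the candidate values for the linear kernel are not listed in the text, so they are omitted.

```python
import itertools
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, validation_indices) pairs for shuffled k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, held_out in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, held_out


def gaussian_svm_grid():
    """Enumerate every (C, sigma) pair for the Gaussian kernel."""
    c_values = [0.001, 0.1, 1, 10, 100]
    sigma_values = [0.1, 1, 4.1, 16]
    return [
        {"kernel": "gaussian", "C": c, "sigma": s}
        for c, s in itertools.product(c_values, sigma_values)
    ]


# Hypothetical dataset size, chosen only to show uneven fold sizes.
for train, validation in kfold_indices(37, n_folds=10, seed=1):
    print(len(train), len(validation))
print(len(gaussian_svm_grid()), "Gaussian-kernel candidates")
```

Each candidate in the grid would be scored on every fold, and the configuration with the best mean validation score would be kept.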

The confusion matrices of the different classification algorithms are shown in Figure 10. From this figure, we find that the recognition accuracy of the foam board is high, whereas the recognition accuracy of the paper board and the wood board is lower; in particular, the wood board and the paper board are not easy to distinguish from each other. We combine this with the acoustic theoretical model and calculate the acoustic impedance of the different materials according to Equation (2). The results are presented in Table 3. The acoustic impedance values of the wood board and the paper board are similar, while the acoustic impedance of the foam board is smaller and quite different from those of the other materials, which likely explains the observation above. The conclusion of the theoretical analysis is consistent with the experimental results under these conditions, which supports the correctness of the acoustic theoretical model.

We compute the mean values and standard deviations of the F1-score and accuracy on the experimental dataset under the different algorithms, and the results are shown in Table 4. All three algorithms give good results, with F1-scores and accuracies all greater than 0.9. KNN has the smallest F1-score and SVM has the largest. The accuracy also differs between classifiers: KNN has the lowest accuracy, about 91.9%, the decision tree algorithm reaches 94.6%, and SVM performs best with an accuracy of 97.3%.

All three classification algorithms can accomplish the goal of material identification accurately. KNN has the worst performance. This is probably due to the ambiguity in determining the value of k and to the fact that outliers and noise have a greater impact on the KNN algorithm. SVM has the best performance for the proposed method. According to the analysis of the acoustic theoretical model, there is a strong nonlinear relationship between material properties and the ultrasonic echo signal. SVM has a stronger capability for nonlinear mapping into a high-dimensional space and can better separate this nonlinear feature vector set, which could explain why SVM performs best for the proposed method [34]. Among these classification algorithms, SVM has the best overall classification performance, and we choose it as the best classifier.

The experimental results of our method in this paper are compared with the relevant literature, as shown in Table 5. The accuracy of the proposed method in this paper is 97.3%, which is slightly higher than the method proposed by Bystrov with an average accuracy of 94.3% and is slightly inferior to the Peniel method proposed by González with an accuracy of 100%. The method proposed by Bystrov extracts features mainly from ultrasonic echo signals and radar signals jointly. However, in extreme environments where radar may fail, the accuracy of this method may be reduced. The method proposed by González requires two ultrasonic sensors to tightly clamp the measured material from both sides. The material recognition speed of the Peniel method may be slow. Therefore, the Peniel method may not be appropriate for the material recognition for robots in extreme environments.

5. Conclusion

In this paper, ultrasonic technology is applied to the task of material recognition for robots, inspired by the predatory strategy of echolocating animals. A novel noncontact method based on ultrasonic echo signals is proposed for robots to recognize the material of surrounding objects, and it can be used in extreme external environments. The method takes the 16-dimensional feature vector extracted from the IMFs obtained by EMD as the input of machine learning algorithms to recognize various materials. KNN, SVM, and decision tree algorithms are applied to the feature vector set to determine the best classifier, and an acoustic theoretical model of the method is built.

Additionally, the test experiment is carried out on the robot, Autolabor2. The experimental results show that KNN, decision tree, and SVM reach 91.9%, 94.6%, and 97.3% accuracy, respectively. Among the three algorithms, SVM has the highest accuracy and the best overall performance, and it is the best classifier for the proposed method. The experimental results verify the accuracy of the acoustic theoretical model and the effectiveness of the proposed method. Compared with existing methods, the proposed method improves on the low accuracy and poor recognition performance of ultrasonic material recognition. This method provides a new idea for robots to recognize and perceive materials in extreme environments.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors gratefully acknowledge the financial support provided by the Key project of the National Natural Science Foundation of China (no. 92048202) and the Key Research and Development Program of Shaanxi Province (2020ZDLGY01-10HZ).