Abstract
Recognizing the motion of the user and the surrounding environment with multiple sensors is important for mobile applications. In a previous study, we developed a mobile-device-based guidance system for visually impaired persons that helps the user walk safely to a destination. However, a mobile device with multiple sensors consumes considerable power when the sensors are activated simultaneously and continuously. We propose a method for reducing the power consumption of a mobile device by considering the motion context of the user. We analyze and classify the user's motion accurately by means of a decision tree and an HMM (Hidden Markov Model) that exploit the data from a triaxial accelerometer and a tilt sensor. We can reduce battery power consumption by controlling the number of active ultrasonic sensors and the frame rate of the camera used to acquire the spatial context around the user. This extends the operating time of the device and reduces the weight of the device's built-in battery.
1. Introduction
Recently, mobile devices have been equipped with a variety of sensors, such as a GPS receiver, an accelerometer, a gyro sensor, and a camera, for recognizing the user's motion and environment, and the efficient utilization of these sensors has been studied [1–3]. However, one difficult issue is the battery life of a mobile device when it activates several sensors continuously. Some of the sensors, such as the camera, consume substantial battery power. A power-saving method for using the sensors effectively is therefore required.
Another issue is the difficulty of extracting precise data from the sensors in the mobile device. Accelerometers and tilt sensors in particular are used to detect the motion context, that is, the relationships between the motions of the user during a certain period of time, together with the scale and direction of the user's motion. However, detecting the exact motion is not easy because the data extracted from the sensors can be noisy, and determining motion features such as the deviation and the mean is difficult.
We propose a method that detects the motion of the user by extracting more accurate data and saves power by activating sensors efficiently. To reduce the operating frequency of the sensors that consume the most power, we activate them only when they are needed, based on an accurate analysis of the user's motion. We determine the motion of the user by analyzing the data gathered from the accelerometer and the tilt sensor, which consume less power and cost less than other sensors. This enables us to control the operation of the other sensors adaptively. We can thus prolong the operating time of the mobile device and/or decrease the weight and size of its battery. To verify the applicability of our proposed method, we applied it to a guidance system for visually impaired persons that was developed in our previous studies [4]. The system is based on a mobile device and uses additional sensors to detect the surroundings: a camera to estimate the indoor position of the user and multiple ultrasonic sensors to avoid obstacles on the path. The device can reduce power consumption by about 15% by adjusting the frequency of sensor use according to the user's motion, compared with activating the sensors constantly. In addition, a configuration that does not use the display device can save about 40% of the power compared with activating the display continuously.
Our method utilizes a sensor data processing technique to improve the recognition rate and accuracy despite the dynamic movement of a user carrying the mobile device. In addition, its motion recognition accuracy is higher than that of previous methods that use data acquired from sensors attached to parts of the user's body. The method detects the user's motion with about 90% accuracy because it uses specific features, such as the vertical and horizontal acceleration components, and applies an HMM-based classifier to improve performance. As a result, the method can precisely detect the motion of the user and effectively reduce the power consumption of the system.
We summarize related work in Section 2. In Section 3, we describe our method in detail. In Section 4, we present the experimental results of applying the proposed method. We conclude our study in Section 5.
2. Related Work
In general, motion context refers to the activity pattern of a user as analyzed using the data extracted from sensors attached to parts of the user's body. Kern et al. [5], Krause et al. [6], Ravi et al. [7], Choudhury et al. [8], and Karantonis et al. [9] researched human activity and context awareness using several accelerometer sensors, analyzing the motion of the user with accelerometer data only. Those methods avoid the orientation problem by attaching each sensor to a specific location on the body. In contrast, our method faces the orientation problem because it collects data from a freely carried mobile device. It is therefore necessary to extract orientation-independent features regardless of the orientation of the mobile device. One solution to the orientation problem is to use the magnitude of each accelerometer axis. Mizell [10] has shown that the average on each axis over a time period can produce an estimate of the gravity-related component. We use a similar approach to estimate the gravity component from each axis of the accelerometer sensor.
In the analysis of accelerometer data, methods for identifying user motion generally use a classifier, such as a decision tree or a GMM (Gaussian Mixture Model). Huynh and Schiele [11] categorized activities such as walking, writing, or sitting using an SVM (Support Vector Machine), and Long et al. [12] used a decision tree to classify a variety of human motions. Husz et al. [13] applied an APM (Action Primitive Model) that analyzed motion using supervised learning, and Nakata [14] classified activities by means of an approximate HMM (Hidden Markov Model). Zhu and Sheng [15] used an HMM to analyze motion data extracted from accelerometer sensors attached to the hand or foot. However, methods using classifiers require additional processing to improve their accuracy and much training data to yield correct classifications.
Mobility is an important factor for mobile devices because power is supplied continuously from their battery, so the device must be usable for long periods on a battery of small capacity. To reduce the power consumption of these devices, several methods have been devised to minimize the use of the CPU and the display [16–18]. Other methods use systemic energy optimization techniques to increase the overall battery life of the device [19–21]. However, these approaches suffer from delayed response times and degraded performance.
In this paper, we exploit the accelerometer and tilt sensor embedded in the mobile device simultaneously to evaluate the motion of the user, and we apply a decision tree combined with an approximate HMM for accurate analysis of the motion in real time. The proposed method can reduce power consumption because it minimizes CPU computation by controlling the frame rate of a camera and the number of active ultrasonic sensors used for recognizing the context of the user's surroundings, without loss of performance such as processing speed. In other words, the method saves power by adjusting the sensors of a mobile device adaptively based on recognition of the user's motion.
3. Power Consumption Control Method
We propose a method for reducing power consumption by adjusting the frequency of use of the active sensors applied for context awareness. The proposed method consists of two stages: motion analysis and power control. First, the method takes advantage of the motion context of the user derived from the accelerometer and the tilt sensor. The motion of the user is analyzed in terms of the acceleration data for the $x$-, $y$-, and $z$-axes obtained from a triaxial accelerometer. In other words, the motion analysis is conducted with features such as the vertical and horizontal acceleration components of the user's action. In addition, we use the tilt sensor to correct errors in the data caused by the mounted position of the accelerometer and the walking style of the user. To analyze the motion context from both sensors accurately, we apply an HMM-based decision tree, a classification technique that applies a time-series method.
Depending on the result of this motion analysis, we determine the frequency of use of the active sensors, which consume a lot of power in the system; that is, we determine the minimum number of ultrasonic sensors that must be active and the minimum frame rate for the camera such that the recognition accuracy remains similar to that obtained when all sensors are used. By activating the necessary sensors only in specific situations, rather than activating all sensors continuously, it is possible to reduce power consumption and extend the battery life of the device.
We present an overview of the proposed method in Figure 1. It comprises two stages, namely, analyzing the motion context with the HMM and controlling the power consumption according to the identified situation, via the activation of specified sensors only.
3.1. Motion Context Estimation Using Accelerometer Sensor
We analyze the motion and orientation of the user by means of the mobile device's built-in triaxial accelerometer and tilt sensor. However, it is not easy to detect the motion directly from those sensors' data: accurate motion recognition is difficult because some of the data may be lost or may contain noise [22, 23]. We therefore use probabilistic inference, constructing a Weka-Toolkit-based decision tree with an HMM classifier that exploits both current and previous data [24, 25]. Although a variety of motions can be analyzed with the data extracted from the sensors, we focus on five motions: Standing, Walking, Fast Walking, Ascending Stairs, and Descending Stairs. For power control, we further distinguish three motions (Standing, Walking, and Fast Walking) based on the walking speed of the user, because these three motions require significantly different amounts of power to activate the sensors needed for context awareness [26, 27].
We acquire acceleration data for the device in the $x$-, $y$-, and $z$-axis directions from the triaxial accelerometer sensor. However, the data are erroneous because of jittering noise, even if the device has been placed on a table. To reduce the jittering noise, we scale down the acceleration data by applying an MAF (Moving Average Filter), as given by (1). Here, $a_x$, $a_y$, and $a_z$ are the raw data and $A_x$, $A_y$, and $A_z$ are the scaled-down data; the factor $N$ defines the number of data points according to the sampling time interval, and $k$ indicates the span value for smoothing:

$$A_x[n] = \frac{1}{k}\sum_{i=0}^{k-1} a_x[n-i], \quad n = 1, \ldots, N, \tag{1}$$

with $A_y$ and $A_z$ defined analogously. This smoothing technique for noise reduction can be applied to both mobile and stationary devices.
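As a minimal sketch of this smoothing step (our own illustration, not the paper's implementation; the span k = 5 and the centered window are assumptions):

```python
import numpy as np

def moving_average_filter(raw: np.ndarray, k: int = 5) -> np.ndarray:
    """Smooth raw triaxial accelerometer samples with a span-k moving average.

    raw: (N, 3) array of a_x, a_y, a_z samples; returns the (N, 3) array of
    scaled-down samples A_x, A_y, A_z in the spirit of (1).
    """
    kernel = np.ones(k) / k
    # Filter each axis independently; mode="same" keeps the original length N.
    return np.column_stack(
        [np.convolve(raw[:, axis], kernel, mode="same") for axis in range(3)]
    )
```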
Orientation problems may occur because every person has a different gait and the mounted position of the accelerometer sensor varies [7, 13]. To solve this problem, we use the magnitude values from the sensor as well as orientation-independent features such as the standard deviation and the mean, obtained from the vertical and horizontal components of the accelerometer signal. At this stage, we determined the sampling period for calculating each value via repeated experiments. Let the acceleration vector at any point be $\mathbf{a} = (a_x, a_y, a_z)$ and let the mean values for each axis be $m_x$, $m_y$, and $m_z$, forming the mean vector $\mathbf{m}$. We then define a reference vector $\mathbf{v} = \mathbf{m}/\lVert\mathbf{m}\rVert$, which is normalized from $\mathbf{m}$. As described in (2), we derive the vertical vector $\mathbf{p}$, which is $\mathbf{v}$ multiplied by the scalar $d$, and the horizontal vector $\mathbf{h}$ from $\mathbf{a}$ and $\mathbf{p}$; here $d$ is the scalar inner product of $\mathbf{a}$ and $\mathbf{v}$, where

$$d = \mathbf{a} \cdot \mathbf{v}, \qquad \mathbf{p} = d\,\mathbf{v}, \qquad \mathbf{h} = \mathbf{a} - \mathbf{p}. \tag{2}$$
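The decomposition in (2) can be sketched as follows (a hypothetical helper of ours, assuming the per-axis mean over a recent window estimates the gravity component as in [10]):

```python
import numpy as np

def decompose(a: np.ndarray, window: np.ndarray):
    """Split one acceleration sample into vertical and horizontal components.

    a: (3,) current smoothed sample; window: (N, 3) recent smoothed samples.
    Returns (d, p, h) as in (2): the scalar projection d = a . v, the
    vertical vector p = d * v, and the horizontal vector h = a - p.
    """
    m = window.mean(axis=0)       # per-axis mean: gravity estimate (Mizell [10])
    v = m / np.linalg.norm(m)     # normalized reference vector
    d = float(np.dot(a, v))       # inner product of a and v
    p = d * v                     # vertical component
    h = a - p                     # horizontal component
    return d, p, h
```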
We evaluate the horizontal and vertical components by estimating the horizontal and vertical vectors: the horizontal magnitude is defined as $\lVert\mathbf{h}\rVert$ and the vertical magnitude as $d$. To determine the parameters used in the classifier, we estimate features such as the mean, standard deviation, 75th-percentile range, and zero-crossing rate, computed from the waveform of each magnitude. To gather sufficient training data, acceleration data were collected from test users over about four hours, with each person carrying out the three motions (Standing, Walking, and Fast Walking). We use a C4.5 decision tree, which is known to increase recognition accuracy as the number of samples increases [28]. The tree classifier uses as features the mean and standard deviation of the vertical and horizontal components of the acceleration: we define meanV and stdV as the vertical features and meanH and stdH as the horizontal features. We generate a well-pruned decision tree (shown in Figure 2) based on $k$-means clustering for matching similar motions. However, the decision tree alone is limited in distinguishing two motions (Walking and Fast Walking), because these motions show complicated patterns that vary over time. Therefore, to improve the motion classification accuracy, we create sequence data by collecting decision-tree classification results of a predetermined length and then apply an approximate HMM, a classification technique based on time-series analysis. In other words, we employ the Viterbi algorithm based on an HMM to maximize the utilization of the correlation between continuous motions [29, 30].
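To illustrate this HMM correction step, the following sketch runs the Viterbi algorithm over a window of decision-tree labels (0 = Standing, 1 = Walking, 2 = Fast Walking). The transition, emission, and prior probabilities shown are illustrative placeholders, not the paper's trained values; in the actual system they would be estimated from training sequences (e.g., from the tree's confusion behavior):

```python
import numpy as np

STATES = ["Standing", "Walking", "FastWalking"]

# Illustrative parameters only.
TRANS = np.array([[0.90, 0.08, 0.02],   # motions tend to persist over time
                  [0.05, 0.85, 0.10],
                  [0.02, 0.13, 0.85]])
EMIT = np.array([[0.95, 0.04, 0.01],    # P(tree label | true motion)
                 [0.05, 0.80, 0.15],
                 [0.02, 0.20, 0.78]])
PRIOR = np.array([0.60, 0.30, 0.10])

def viterbi(tree_labels):
    """Most likely motion sequence given a window of decision-tree outputs."""
    n, s = len(tree_labels), len(STATES)
    logp = np.full((n, s), -np.inf)      # best log-probability per state
    back = np.zeros((n, s), dtype=int)   # backpointers for path recovery
    logp[0] = np.log(PRIOR) + np.log(EMIT[:, tree_labels[0]])
    for t in range(1, n):
        for j in range(s):
            scores = logp[t - 1] + np.log(TRANS[:, j])
            back[t, j] = int(np.argmax(scores))
            logp[t, j] = scores[back[t, j]] + np.log(EMIT[j, tree_labels[t]])
    path = [int(np.argmax(logp[-1]))]    # backtrack from the best final state
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [STATES[i] for i in reversed(path)]

# e.g., a noisy tree output [1, 1, 2, 1, 1] is smoothed toward "Walking".
```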
3.2. Adaptive Power Control via Motion Context
To verify the effectiveness of the proposed method, we implemented a prototype system that acquired the user’s spatial context using a variety of sensors. To reduce power consumption, we controlled the frame rate and the number of active sensors based on the motion context. The prototype system could detect an obstacle in the user’s path via six ultrasonic sensors. To recognize objects in front of the user, it is important to arrange the sensors efficiently to cover the maximum range with the minimum number of sensors based on each sensor’s physical characteristics, such as its coverage and the detection range. In addition, the sensors should detect obstacles quickly and precisely. Therefore, we estimate the geometric information for all sensors and determine their optimal placement via repeated experimentation [31]. As depicted in Figure 3, we simplify the spatial structure in front of the user by classifying it as one of several predefined patterns. We then determine an avoidance direction by evaluating the pattern to minimize the probability of collision with the obstacle. As shown in Figure 4, we set each sensor’s direction and coverage to overlap as little as possible with those of neighboring sensors, by considering the walking speed of the user and the sensing rate of the sensor [32].
We consider the range data extracted from four ultrasonic sensors and represent the spatial information in terms of patterns in front of the user. The range data are classified into four cases: danger (less than 100 cm), warning (100~130 cm), adequate (130~200 cm), and unconcern (more than 200 cm). We can identify 256 (= 4^4) combinations of the range data from the four sensors, and all cases are stored in a table (see Table 1). Each number denotes one of the four cases, namely, 0 (danger), 1 (warning), 2 (adequate), and 3 (unconcern). The avoidance instructions are classified into three cases, namely, turn-left, turn-right, and forward. The avoidance direction for the obstacle can therefore be determined by referring to the table.
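A minimal sketch of this lookup (the threshold boundaries follow the text; the table entry and the "forward" default are made up for illustration, since Table 1 holds the full 256-entry mapping):

```python
def classify_range(cm: float) -> int:
    """Map one ultrasonic reading to 0 (danger) .. 3 (unconcern)."""
    if cm < 100:
        return 0   # danger
    if cm < 130:
        return 1   # warning
    if cm <= 200:
        return 2   # adequate
    return 3       # unconcern

# Table 1 maps all 256 (= 4**4) patterns to instructions; one made-up entry here.
AVOID_TABLE = {0: "turn-left"}   # e.g., all four sensors reporting danger

def avoidance(ranges_cm) -> str:
    """Determine the avoidance instruction from four frontal readings."""
    levels = [classify_range(r) for r in ranges_cm]
    key = levels[0] * 64 + levels[1] * 16 + levels[2] * 4 + levels[3]  # base-4 index
    return AVOID_TABLE.get(key, "forward")   # illustrative default only
```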
As shown in Figure 5(a), if the motion is recognized as Walking or the user proceeds straight ahead, we can deactivate the two sensors that sense spatial information to the left and the right of the user. This is because four sensors for detecting frontal space can detect obstacles placed to the left and the right of the user if the walking speed is average. It is therefore possible to reduce power consumption by selectively activating the sensors that are arranged in the same direction as the walking direction of the user. As shown in Figure 5(b), when the motion is perceived as Fast Walking or an obstacle is detected, we have to acquire spatial information to the left and the right of the user to avoid obstacles. In addition, it is necessary to analyze the frontal space precisely for Fast Walking. We therefore have to activate all ultrasonic sensors. This will enable us to detect an obstacle accurately even if the user walks fast.
In addition, we attached identifying markers to the ceiling at regular intervals to enable tracking of the position of the user via camera recognition of the markers. We increase the camera’s frame rate for accurate recognition of the markers when the motion is recognized as Fast Walking and minimize the frame rate when the motion is perceived as Walking, as shown in Figure 6. The method can reduce the required battery power by decreasing the frame rate, while maintaining the detection accuracy, when the motion is recognized as Walking.
4. Experimental Results
4.1. Motion Patterns Analysis
It is very important to classify the various human motions correctly. We conducted experiments to compare the accuracy of several classifiers in detecting specific motions from input data such as the mean and standard deviation of the horizontal and vertical components obtained from the accelerometer sensor. We compared and analyzed four classifiers based on probabilistic inference techniques: decision tree (DT), naïve Bayesian (NB), $k$-nearest neighbor ($k$-NN), and logistic regression (LR). The window size of the classifiers was set to 100 samples collected over the same duration for the five motions: Standing, Walking, Fast Walking, Ascending Stairs, and Descending Stairs. Table 2 shows the classification accuracy of each classifier. As shown in the results, all the classifiers classified the Standing motion well, but they showed lower accuracy for the Ascending Stairs motion than for the other motions. The decision tree classified all the motions better than the other classifiers did. We therefore decided to use a decision tree as the motion classifier.
We construct a C4.5 decision tree, generated by the Weka Toolkit, which is known to be a relatively accurate method even with a small number of training samples [28]. We perform the training and execution phases of a process that detects motion. In the training phase, we collect users' motion data and calculate the mean and standard deviation of the horizontal and vertical components of the acceleration values continuously over a predefined period. We then generate the decision tree using $N$ samples and test data [4]. The accuracy of recognition increases as the numbers of training samples and test data increase; however, in our experiments, we obtained high accuracy even with small sample spaces. We identify three motions depending on the walking speed of the user: "Standing (0 km/h)," "Walking (less than 3 km/h)," and "Fast Walking (less than 5 km/h)." The experiments also include results for Ascending Stairs and Descending Stairs. In the execution phase, the current motion is determined by exploring the decision tree. We can achieve accurate motion classification by periodically checking the horizontal and vertical components and transferring only accurate values to the decision tree.
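As a rough stand-in for this training/execution pipeline (we use scikit-learn's CART-style DecisionTreeClassifier in place of Weka's C4.5, and random placeholder data in place of the collected feature windows):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Each row is one window's features [meanV, stdV, meanH, stdH]; labels are
# 0 = Standing, 1 = Walking, 2 = Fast Walking. Placeholder data only.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 4))            # N = 50 training windows
y_train = rng.integers(0, 3, size=50)

# Training phase: fit a shallow (pruned-style) tree.
clf = DecisionTreeClassifier(max_depth=4).fit(X_train, y_train)

# Execution phase: classify the current window's features.
motion = clf.predict([[0.10, 0.02, 0.30, 0.05]])[0]
```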
The size $N$ of the sample space is an important factor for the decision tree. To determine a suitable value, we carried out experiments that measured the accuracy of the motion classification and the tree search time for various values of $N$. We collected a training data set and generated a decision tree having $N$ samples [33]. Table 3 reports the accuracy of motion detection and the tree search time for various values of $N$. The accuracy increases as $N$ increases, but so does the classification computation time; in addition, the search time for recognizing the motion is proportional to $N$. Therefore, we design the tree by considering the trade-off between the accuracy of motion detection and the motion recognition time. From these experiments, we determined that $N$ should be 50, because the results show that the detection accuracy of all motions is high and the computation is completed in 0.25 seconds; that is, the motion is detected sufficiently accurately at the lowest cost.
We consider the number of active ultrasonic sensors and the sampling rate of the camera, which can be controlled according to the three motions (Standing, Walking, and Fast Walking) that require significantly different amounts of power. In the Standing state, we do not supply power to the ultrasonic sensors or the camera. When the motion is recognized as Walking, we activate only the four ultrasonic sensors that detect obstacles in front of the user, and we capture images at a frame rate of about 3 fps. When the motion is perceived as Fast Walking, we activate all ultrasonic sensors and operate the camera at its maximum frame rate (5 fps). Through a number of experiments, we determined the optimal number of active sensors and the sampling rate of the camera for each situation, aiming to maximize the accuracy of motion detection and minimize power consumption. We constructed a confusion matrix from a decision tree using 10,000 samples; the results are presented in Figure 7, which confirms that the number of sensors and the frame rate of the camera changed adaptively according to the motion of the user.
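The resulting control policy can be stated compactly; the numbers below come directly from this section, while the wrapper itself is our own sketch:

```python
from dataclasses import dataclass

@dataclass
class PowerPolicy:
    active_sensors: int   # number of powered ultrasonic sensors
    frame_rate: float     # camera frame rate in fps (0 = camera off)

POLICY = {
    "Standing":    PowerPolicy(active_sensors=0, frame_rate=0.0),
    "Walking":     PowerPolicy(active_sensors=4, frame_rate=3.0),
    "FastWalking": PowerPolicy(active_sensors=6, frame_rate=5.0),
}

def apply_policy(motion: str) -> PowerPolicy:
    """Select sensor count and camera frame rate from the recognized motion."""
    return POLICY[motion]
```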
4.2. Accuracy Measurement
We evaluated the performance with five randomly selected students aged between 20 and 40 and four visually impaired persons. The users were not familiar with the experiment, and the students were blindfolded. The obstacle placed on the path was a box about 20 cm wide. We determined the optimal marker size to be 12 × 12 cm, considering the distance from the ceiling to the ground and the camera's viewing angle. If an obstacle was detected, the user was required to walk until hearing the message "Stop." A scan using the six sensors required about 125 ms, and the latency (the time between detecting an obstacle and providing feedback to the user) was set to 400 ms. Via repeated experiments, we determined that this latency offered the user sufficient time to react to any motion change.
As shown in Table 4, we measured the obstacle detection rate for various numbers of sensors. When the motion was recognized as Walking, the detection rates were 94% and 97% for four and six active ultrasonic sensors, respectively; the results were similar in both cases. However, for Fast Walking, at a speed 67% faster than Walking, the detection rate with four active sensors was reduced by about 40% compared with six sensors. We therefore need to activate only four sensors (to reduce power consumption) during Walking, but we should operate all sensors during Fast Walking to maintain similar accuracy in both cases.
Figure 8 shows the marker detection rate for different camera frame rates. We measured frame rates from 1 fps to 7 fps but focused on the three rates (2, 3, and 5 fps) that showed high accuracy and saved substantial power over repeated experimentation. When the motion is recognized as Walking, the accuracy at 3 fps is similar to that at 5 fps, whereas the detection rate at 2 fps is substantially lower. We therefore use 3 fps during Walking because it requires less power than 5 fps while offering similar accuracy.
4.3. Power Consumption Measurement
We measured the relative power consumption by setting a time slot (10,000 samples) and considering three patterns (six active ultrasonic sensors, four, and none). Figure 9 shows the power consumption for various numbers of active ultrasonic sensors. We evaluated the power consumption from the current and voltage of the battery: the power consumed equals the product of voltage and current, so the power consumption depends on the number of active sensors through the current they draw. From the experimental results, about 450 mA was required with six active sensors, compared with 350 mA when none were activated. We can therefore reduce power consumption by controlling the number of active sensors based on the motion while maintaining the accuracy required for obstacle detection.
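As an illustrative calculation (assuming a nominal 3.7 V battery voltage, which the paper does not specify), six active sensors at 450 mA correspond to about 3.7 V × 0.45 A ≈ 1.67 W, versus about 1.30 W at 350 mA with no sensors active; that is, deactivating the ultrasonic sensors reduces the current draw by roughly 100 mA / 450 mA ≈ 22%.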
Figure 10 shows the power consumed for various camera frame rates. We evaluated the relative power consumption by setting a time slot (10,000 samples) and considering three cases (sampling at 2 fps, 3 fps, and 5 fps). When sampling at 5 fps, about 17% more battery power is required than when sampling at 3 fps. As shown in the figure, the frame rate of the camera affects the current it requires and hence its power consumption. We can control the frame rate of the camera adaptively according to the user’s motion to reduce power consumption while maintaining detection accuracy.
We measured the power consumption in two cases: with and without applying the motion context. The case applying the motion context is defined as SAS (Selectively Activating Sensors) and the case not using the motion context is defined as FAS (Fully Activating Sensors). Each experiment was carried out on a simple path including obstacles and on a congested path with a long walking distance. As shown in Figure 11, when the motion was recognized as Walking on the simple path, the system consumed about 15% less power than when it was recognized as Fast Walking, and about 18% less than FAS, which does not use the motion context. On the congested path, when the motion was perceived as Walking, the system consumed about 12% less power than when it was recognized as Fast Walking, and about 20% less than FAS. These experiments verify the effectiveness of the proposed method, showing a relative power saving of approximately 15% or more compared with operation that does not use the motion context.
5. Conclusions
In this paper, we analyzed the motion context of a mobile device user using data from the device's triaxial accelerometer and tilt sensor. We found that we could reduce the device's power consumption by controlling the number of active sensors and the frame rate of the camera used to acquire the spatial context, based on the user's identified motion. This enables the device to be used for an extended time and its weight and size to be reduced, because the battery capacity can be decreased without excessively compromising performance. As future work, we are applying the proposed method to the smartwatch as another type of mobile device. The proposed method can be applied to various mobile devices equipped with a triaxial accelerometer and can save power by controlling the activation of the sensors embedded in the device.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant no. HI14C0765). This work was supported by INHA University research grant.