Abstract
Workers in material handling tasks often suffer from work-related musculoskeletal disorders (WMSDs) caused by improper work postures or the lifting of excessively heavy loads. Effective ergonomic assessment is therefore needed to improve worker productivity while reducing the risk of musculoskeletal disorders. This paper proposes a noninvasive method for posture risk evaluation and load analysis in manual material handling tasks. The study focuses on three main aspects. First, 3D pose recognition technology is used to extract the 3D coordinates and joint angles of the human body. Second, the REBA method is improved with fuzzy logic theory so that it captures the gradual transitions of continuous human movement without abrupt changes in risk scores, which increases the accuracy and consistency of posture risk evaluation. Third, joint torques and workloads are estimated through biomechanical calculations that integrate pressure insole data with 3D joint coordinates. Experiments show that the method can effectively evaluate posture risks and workloads in manual material handling tasks, with a correlation coefficient of 0.817 between fuzzy logic REBA and REBA and an error rate of 15% in estimating the workloads of eight joints. This method can help reduce occupational health risks for workers and industries and improve work efficiency.
1. Introduction
Manual material handling (MMH) operations are commonly found in many industries such as manufacturing, logistics, and construction. Due to the repetitive and awkward nature of these tasks, workers often experience discomfort and overexertion issues [1, 2], which can lead to work-related musculoskeletal disorders (WMSDs). Especially in developing countries, many industries cannot meet the demand for automation, and most productive work is still done through semiautomatic means. Since WMSDs have early symptoms that are not always obvious, the onset of the disease is often delayed. Workers frequently ignore them. Continuous improper working posture not only endangers workers’ health but also results in decreased productivity, reduced production capacity [3], and economic losses. Therefore, effective risk assessment and load analysis methods must be developed for MMH operations to prevent WMSDs and improve worker health and safety.
Effective human-machine ergonomic evaluation to avoid ergonomic risks in the work process is currently the mainstream research direction for reducing the incidence of WMSDs.
Manual observation methods and wearable inertial sensors are widely used in ergonomic postural risk assessment, and although these studies have demonstrated the feasibility of the methods, they still have some limitations [4, 5]. The results of manual observation are subjective, and different observers may give different assessments based on the same posture. Wearable sensors can capture body joint information more objectively and accurately, but this approach requires multiple sensors on the operator, which can interfere with normal work and is invasive.
Therefore, more practical and effective methods are needed to assess posture and load risks in MMH operations [6]. With the development of computer vision, human pose estimation can now recognize human posture from images or video frames of people at work. Once the worker’s skeletal pose data are obtained, WMSD risk can be assessed by applying risk assessment rules. This approach is a current trend in machine vision-based ergonomic risk assessment [7]. However, some issues remain. Most current research applies the rules of traditional postural risk assessment methods directly to vision-based systems. Pose estimation yields precise, continuous joint angles, whereas traditional methods assign one generalized score to each range of joint motion, so combining the two directly is problematic. How to improve the traditional assessment method so that it assesses operational risk more accurately and suits vision-based risk assessment is the current research gap. In addition, vision-based posture assessment is only informative for activities with a large amplitude of joint movement. In activities such as material handling, the posture is relatively fixed but the load on the body is large; only the elbow joint moves through a large range, so the overall risk score is low even though the physical impact of the activity is high. Joint loads can instead be analyzed by combining the joint positions obtained from pose estimation with the external loads, which assesses operational risk from another perspective. Combining postural risk assessment with joint load assessment therefore gives a more complete ergonomic risk assessment.
In this study, we propose a novel method that combines human pose recognition, pose risk assessment, and biomechanical load analysis to evaluate posture risk and joint loads during MMH operations. The method enables accurate and noninvasive monitoring and evaluation of workers’ body posture while taking into account the dynamic aspects of loads, risks, and posture during the job process. The primary contributions of the study are as follows:
(1) The proposed approach can accurately identify the workers’ full-body posture, improving visual recognition performance under occlusion conditions.
(2) The traditional REBA risk assessment method is improved to avoid sudden changes in REBA assessment scores caused by changes in joint angle inputs, making it more suitable for machine vision-based job risk assessment.
(3) A method combining computer vision technology and sensors is proposed to monitor joint torques and workloads through a new noninvasive means.
The remainder of this paper is organized as follows: In Section 2, we present related work on ergonomic posture risk assessment as well as biomechanically-based human load assessment and further summarize the gaps in the current research area. Section 3 presents models for human posture estimation, transformations for joint angle calculation, fuzzy logic-based REBA risk assessment, and finally, biomechanically-based load estimation. Section 4 describes the whole experimental environment as well as the experimental flow and experiments on the comparison of joint recognition accuracy by IMU sensors with the proposed method. In Section 5, we further carry out experiments on the comparison of REBA with the improved REBA and experimental results on the joint torque and joint loading evaluation, as well as discuss the superiority of the proposed method, and finally, Section 6 summarizes the work of the paper.
2. Related Work
2.1. Current Status of Ergonomic Posture Risk Assessment
Ergonomic postural risk assessments (EPRA) are commonly used to identify potential risks of WMSDs such as poor posture and repetitive movements [8]. These methods, such as OWAS [9], NIOSH [10], REBA [11], and RULA [12], rely on on-site observation or video examination of the joint angles between body parts. The Rapid Entire Body Assessment (REBA) is a typical assessment method that reduces the risk of WMSDs by evaluating how heavily postures and movements load the worker’s body and then taking preventive measures according to the risk level. The REBA assessment process is divided into four parts: (1) observation: observe the postures and movements of the workers, with special attention to the position of their limbs and the amplitude of their movements; (2) scoring: use the REBA scale to assign a score to each body part and movement; (3) analysis: calculate the overall score and determine the risk level of the postures and movements from it; and (4) recommendations: suggest improvements based on the assessment results to reduce the physical load on the workers and the risk of injury. However, manually completed EPRA may lead to several problems. First, EPRA results may be affected by the researcher’s perspective and fatigue. Second, manual observation and assessment are subjective and time-consuming.
Therefore, new techniques have emerged to replace manual assessment. These techniques fall into two categories: contact sensor methods and noncontact vision methods. Different methods of risk assessment of body postures, both invasive and noninvasive, are shown in Table 1. The contact sensor approach attaches sensors to the subject to collect musculoskeletal and motion data during work [13, 19]. Common sensors include motion capture systems and inertial sensors. This approach allows for an objective assessment of WMSD risk [20]. However, it is invasive: wearing sensors may hinder workers while they work, and the high cost of the instruments and the time-consuming testing process largely confine it to the laboratory, which makes it difficult to apply widely in actual production. The noncontact vision method therefore has wider application value. Using a single RGB camera for image recognition in ergonomic risk assessment has become a current research trend [21–23]. Li et al. [18] proposed a real-time RULA estimation method based on a deep neural network for two-dimensional joint pose, which first uses 2D action recognition to identify the skeleton points of the human body and then uses a 3D neural network to identify the positions and vectors of each joint. Next, the human body joints are projected onto the sagittal plane, and the angles between the joints are calculated to perform the RULA evaluation. Lee and Lee [17] proposed SEE, an ergonomic risk assessment system that combines the convolutional pose machine (CPM) method with a fast full-body assessment method. It can capture the overall human body posture for ergonomic risk analysis and only requires posture video frames or images captured by a single camera as input. The system can also be used to develop smartphone-based WMSD risk assessments.
Wang et al. [24] proposed an approach for predicting work-related musculoskeletal disorders (WMSD) that integrates three artificial intelligence algorithms and utilizes dynamic characteristics of working posture. A posture risk assessor examines the working posture’s danger level frame by frame, while a posture detector detects the angles and states of the limbs. A task risk predictor is also used to forecast the risk level of the present work process.
2.2. Current Status of Biomechanically Based Human Load Assessment
Biomechanical analysis is a method of measuring the load on the human body by evaluating joint forces or torques. It simplifies human joint activity as a hinge linkage mechanism. The joint torque is estimated by mechanical calculations based on the human joint position, anthropometric characteristics, and external loads [25]. Workload evaluation has been performed in a variety of contexts using biomechanical analysis, including physical and cognitive workloads [26, 27]. Theoretically, biomechanics encompasses all critical components of workload evaluation, including intensity, repetitiveness, external load duration, and posture. Almost any action or body component may be studied using biomechanical analysis, including repeated motion and static external load [28]. However, there is a gap between theoretical analysis and actual application, mostly because the techniques for collecting motion and external load data are imprecise. Although observation has been employed extensively to gather motion data, its results are considered too subjective and unreliable to support biomechanical research [29, 30]. More accurate and automated data collection methods are therefore needed. Kim et al. [31] proposed a method to estimate changes in human joint torque in real time during a large number of manipulation tasks by collecting joint position information and ground reaction forces from experimenters who wore motion capture suits and stood on a force plate. However, wearable motion capture suits interfere with normal work, and the restricted movement of a person standing on a force plate may not suit practical applications. In addition, human-machine simulation software such as OpenSim and AnyBody can perform muscle-driven dynamic simulations, providing a viable approach to analyzing the forces and torques in the musculoskeletal system and assisting in the evaluation of human load [32–34].
Human load assessment using computer vision is also a trend in research, as exemplified by the theoretical method proposed by Yang et al. [35] to analyze video tracking postures using a biomechanical model. By representing the biomechanical skeleton of the human body and evaluating workload and joint torque quickly and accurately based on joint rotation angle, the method can be used for work-related tasks. Kong et al. [36] proposed a 3D biomechanical model based on computer vision technology to study workers’ mechanical energy consumption by approximating the working posture coordinates of human joints with a 2D video-based human body 3D pose estimation algorithm and using smart insoles to collect foot pressure and acceleration as input data for biomechanical analysis. The total maximum daily consumption of building tasks can be approximated through tasks such as walking, lifting, and bending. Currently, a new type of foot pressure sensor called Moticon is applied in biomechanical load analysis. The sensor can be used with almost any shoe, and smart insoles on the market can transmit data through wireless ANT services. Afari et al. [37] demonstrated the use of smart insoles to obtain the foot pressure of construction workers.
To summarize, most current machine vision-based posture risk assessment approaches directly apply the assessment logic of traditional observational methods (e.g., RULA and REBA). However, this approach has a serious problem: the traditional methods respond poorly to the input variables, being insensitive within a score band and changing abruptly at its boundary. For example, with REBA, 45° and 90° of flexion give the same upper arm score. In contrast, a risk score of 2 when the torso is flexed 20° and 3 when the torso is flexed 21° may result in two different final REBA values, which may change the risk level, as highlighted by previous studies [38, 39]. Directly integrating the traditional observational assessment method into machine vision-based risk assessment is therefore problematic, and the traditional posture risk assessment method needs to be improved to make it more suitable for risk assessment via machine vision. Our approach performs operational risk assessment in a noninvasive way and improves the traditional REBA approach by introducing fuzzy logic, so that the assessment results do not increase or decrease abruptly as the input joint angles change. The method reflects the gradual transition of critical angles during human movement without sudden changes in risk ratings. In biomechanical load assessment research, most studies collect data and calculate biomechanical loads with wearable motion capture suits, force plates, and other invasive devices, which restrict the normal mode and range of human activity and are difficult to apply in actual production; other studies rely on human-machine simulation software, which does not run in real time.
Therefore, we obtain the human reaction force by means of the pressure insole, and calculate the joint torques and joint workloads by combining the human joint position information and joint angles collected in the risk assessment part, and combine the two parts of the assessment with each other to make the overall ergonomics assessment more complete.
3. Methods
This research proposes a machine vision-based postural risk assessment methodology designed to monitor, assess, and predict the level of risk of work-related musculoskeletal disorders (WMSDs) from job videos and pressure sensor data. The general architecture of the proposed method is shown in Figure 1:
(1) A monocular camera captures video frames of a worker during operation, and 2D human pose estimation is used to obtain the 2D joint points of the human body. A 3D pose estimation method then predicts the position of each joint point of the worker in 3D space, from which the limb angles are calculated.
(2) The joint angles are input into the FREBA model for risk assessment. The model uses a fuzzy logic approach, which includes constructing the membership functions, fuzzification, fuzzy inference, and defuzzification.
(3) Pressure sensors are placed inside the operator’s shoes to obtain the plantar pressure during the operation, and the external load is obtained by subtracting the operator’s body weight. The three-dimensional positions of the operator’s joint points at the same moment are then combined with the external load to calculate the joint torques of the body parts, and finally, the joint workload is calculated according to the load-bearing capacity of the joints.

3.1. Human Pose Estimation Model
We collected video frames of workers at work and used an open-source toolkit called MMPose to obtain their pose information. MMPose is an open-source framework based on PyTorch that provides a set of reusable models, datasets, and tools for training and evaluating human pose estimation models. The framework implements state-of-the-art deep learning algorithms and has been extensively tested on multiple datasets, demonstrating good performance in human pose estimation.
We used a top-down pose estimation method and processed the collected video frame sequences using three different neural network models, as shown in Figure 2. The pose recognition model is divided into three modules. First, we used the Faster R-CNN network [40] as the object detection layer to capture the worker’s position. Next, we used the HRNet [41] model to recognize the worker’s 2D human body pose and extract 16 key points. Finally, we used VideoPose3D to convert the 2D keypoint predictions into 3D coordinates, predicting the position of each keypoint of the worker in the 3D coordinate system. VideoPose3D [42] employs the Hourglass network, which enables simultaneous consideration of both the local area and the entire image features. This network establishes a link between pixels or sets of pixels and multilayer neural networks, resulting in a comprehensive description. The entire pose recognition model was trained on the COCO dataset, aiming to transform the input video into human body key points and extend them to the relative 3D position of each joint.

After identifying the 3D pose of the human body, we can extract the 3D position of each joint. To evaluate the posture using REBA, however, we also need the limb vectors and their angles, which are obtained with trigonometric functions. Each limb is represented by the vector connecting two adjacent key points, and the absolute angle of a limb is computed from this vector, as given in equation (1).
During posture estimation, the angle of one limb may depend on the position of another body part (e.g., the angle of the forearm is affected by the position of the upper arm). The relative posture angle of the forearm with respect to the upper arm must therefore be calculated. The relative angle between two adjacent limbs is obtained from their limb vectors, as given in equation (2).
Figure 3 shows where the resulting joint angles lie on the human skeleton; 11 joint angles were defined for use in the REBA postural risk assessment.
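The angle calculation can be illustrated with a minimal sketch. It assumes the standard dot-product form for the angle between two vectors and, for the absolute angle, a downward vertical reference axis; the reference axis, the function names, and the key point coordinates are hypothetical and are not taken from the original equations.

```python
import numpy as np


def limb_vector(p_start, p_end):
    """Vector pointing from one key point to the adjacent key point."""
    return np.asarray(p_end, dtype=float) - np.asarray(p_start, dtype=float)


def angle_between(u, v):
    """Angle in degrees between two 3D vectors, via the normalised dot product."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clipping guards against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


# Hypothetical key points in metres (z axis pointing up).
shoulder = [0.0, 0.0, 1.4]
elbow = [0.0, 0.0, 1.1]
wrist = [0.3, 0.0, 1.1]

upper_arm = limb_vector(shoulder, elbow)
forearm = limb_vector(elbow, wrist)

# Absolute angle: measured here against the downward vertical (an assumption).
upper_arm_abs = angle_between(upper_arm, [0.0, 0.0, -1.0])
# Relative angle of the forearm with respect to the upper arm.
elbow_relative = angle_between(upper_arm, forearm)

print(upper_arm_abs, elbow_relative)
```

With the arm hanging vertically and the forearm horizontal, the sketch reports an upper arm angle of about 0° and a relative forearm angle of about 90°.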

3.2. Fuzzy Logic REBA Risk Assessment
The proposed fuzzy logic REBA assessment is based on the REBA method proposed by Hignett and McAtamney [11]. REBA is an observational ergonomic assessment method that scores postures and actions to estimate whether a job poses a potential health hazard. However, the traditional REBA cannot be directly integrated into machine vision-based risk assessment. Because the joint angles estimated by posture recognition techniques are very precise, a change of 1° or even 0.001° in a joint angle can lead to a sudden change in the integer risk rating when input into REBA.
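The discontinuity can be made concrete with a small sketch of crisp trunk scoring. The thresholds follow the example cited earlier in the paper (a score of 2 at 20° and 3 at 21°) and the REBA worksheet as commonly published; the function name is hypothetical, and the adjustments for twisting or side bending as well as trunk extension are left out for brevity.

```python
def crisp_trunk_score(flexion_deg):
    """Integer REBA trunk score for forward flexion only (no adjustments)."""
    if flexion_deg < 0:
        raise ValueError("trunk extension is not covered in this sketch")
    if flexion_deg == 0:
        return 1  # upright
    if flexion_deg <= 20:
        return 2
    if flexion_deg <= 60:
        return 3
    return 4


# A change of one thousandth of a degree flips the integer score.
scores = {angle: crisp_trunk_score(angle) for angle in (19.999, 20.0, 20.001, 21.0)}
print(scores)
```

The score stays at 2 up to 20° and becomes 3 at 20.001°, which is the jump that the fuzzy formulation is meant to smooth.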
Membership functions must be constructed for all model variables, starting with the input joint angles. A fuzzy membership function maps each element of a set X to a value in the range [0, 1], which allows numerical calculation in the subsequent fuzzy inference process. In the REBA method, the body is divided into two main sections. The neck, trunk, and legs make up the first part, and Table A [11] in the REBA worksheet aggregates their respective scores. The second part encompasses the upper arm, forearm, and wrist, whose scores are combined in Table B [11] of the REBA worksheet. Six sets of membership functions were established for the corresponding body parts in REBA, as depicted in Figure 4. In this study, the joint angles were fuzzified using trapezoidal functions, and the REBA intermediate scores were fuzzified using triangular functions. The final REBA score was also obtained using triangular membership functions. The trapezoidal membership function is

$$\mu(x; a, b, c, d) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & b \le x \le c \\ \dfrac{d - x}{d - c}, & c < x < d \\ 0, & x \ge d \end{cases} \tag{3}$$

The triangular membership function is

$$\mu(x; a, b, c) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x \le b \\ \dfrac{c - x}{c - b}, & b < x < c \\ 0, & x \ge c \end{cases} \tag{4}$$
For joint angles, adjacent membership functions intersect at a membership value of 0.5 to allow for a gradual transition between variables; a, b, c, and d in formulas (3) and (4) are all real numbers. A fuzzy rule-based system was then created from if-then statements, comprising three rule sets with 240 rules in total. Figure 5 shows the scores for the neck, legs, and trunk based on rule set A. For example, if the neck score is 2, the leg score is 3, and the trunk score is 3, then the combined score for the three is 5.
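The fuzzification step can be sketched as follows, using the standard trapezoidal and triangular membership shapes. The breakpoints of the two trunk flexion sets are hypothetical values chosen only so that adjacent sets cross at a membership of 0.5 at 20°; they are not the parameters shown in Figure 4.

```python
import numpy as np


def trapmf(x, a, b, c, d):
    """Trapezoidal membership: rises on (a, b), equals 1 on [b, c], falls on (c, d)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.zeros_like(x)
    rising = (x > a) & (x < b)
    y[rising] = (x[rising] - a) / (b - a)
    y[(x >= b) & (x <= c)] = 1.0
    falling = (x > c) & (x < d)
    y[falling] = (d - x[falling]) / (d - c)
    return y


def trimf(x, a, b, c):
    """Triangular membership, i.e. a trapezoid whose plateau is the single point b."""
    return trapmf(x, a, b, b, c)


# Hypothetical trunk flexion sets (degrees) that overlap around 20 degrees.
angles = np.array([10.0, 18.0, 20.0, 22.0, 30.0])
band_0_20 = trapmf(angles, 0.0, 5.0, 15.0, 25.0)
band_20_60 = trapmf(angles, 15.0, 25.0, 55.0, 65.0)

for angle, low, high in zip(angles, band_0_20, band_20_60):
    print(f"{angle:5.1f} deg  lower band: {low:.2f}  upper band: {high:.2f}")
```

At 20° both sets have a membership of 0.5, so an angle near the boundary contributes partly to each score rather than switching from one to the other.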

Defuzzification is the process of converting fuzzy values into precise, crisp values and can be understood as a mapping from the fuzzy space to the crisp space. It is the last step of a fuzzy logic system. In this study, the widely used centroid method was adopted for defuzzification. The centroid method takes the centroid of the area bounded by the membership function curve and the horizontal axis as the final output value. The calculation formula is

$$z^{*} = \frac{\int \mu(z)\, z \, dz}{\int \mu(z)\, dz} \tag{5}$$

In formula (5), $z$ is the output variable, $\mu(z)$ represents the membership value of the output variable, and $z^{*}$ is the crisp output; the maximum membership subset is selected based on the principle of maximum membership value. Fuzzy logic theory generates smooth transitions in the results as the joint angles change. The final evaluation score therefore also changes gradually, which avoids sudden changes in REBA evaluation scores and improves the method’s reliability.
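A numerical sketch of centroid defuzzification is given below. It assumes Mamdani-style clipping of two triangular output sets (scores 2 and 3) by their rule firing strengths and approximates the centroid by a discrete sum on a uniform grid; the output sets, the firing strengths, and the function names are hypothetical.

```python
import numpy as np


def tri(z, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    return np.maximum(np.minimum((z - a) / (b - a), (c - z) / (c - b)), 0.0)


def centroid(z, mu):
    """Discrete centroid of mu(z) on a uniform grid."""
    total = np.sum(mu)
    if total == 0:
        raise ValueError("aggregated membership is zero everywhere")
    return float(np.sum(mu * z) / total)


Z = np.linspace(1.0, 4.0, 3001)  # output axis covering scores 1 to 4


def defuzzify(w_score2, w_score3):
    """Clip each output set at its firing strength, aggregate by max, take the centroid."""
    clipped2 = np.minimum(w_score2, tri(Z, 1.0, 2.0, 3.0))
    clipped3 = np.minimum(w_score3, tri(Z, 2.0, 3.0, 4.0))
    return centroid(Z, np.maximum(clipped2, clipped3))


print(defuzzify(1.0, 0.0))  # only the "score 2" rule fires
print(defuzzify(0.5, 0.5))  # both rules fire equally
print(defuzzify(0.4, 0.6))  # slightly more weight on "score 3"
```

As the firing strength shifts from the lower set to the higher one, the crisp output moves continuously between 2 and 3 instead of jumping.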
The evaluation is completed using the Fuzzy Logic Toolbox in MATLAB based on the determined input variables, output variables, their membership functions, and fuzzy rules. Finally, scores related to activity types are added. Final REBA scores range from 1 to 11 or more, with higher scores indicating a greater risk of WMSDs. Table 2 shows these scores and their corresponding action levels.
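The mapping from a final score to an action level can be sketched as a lookup. The bands below are those of the standard REBA worksheet as commonly published and are assumed to match Table 2; the function name is hypothetical, and a fuzzy (non-integer) score would first need to be rounded by the caller.

```python
# (lowest score, highest score, action level, risk level, action)
REBA_ACTION_LEVELS = [
    (1, 1, 0, "negligible", "none necessary"),
    (2, 3, 1, "low", "may be necessary"),
    (4, 7, 2, "medium", "necessary"),
    (8, 10, 3, "high", "necessary soon"),
    (11, 15, 4, "very high", "necessary now"),
]


def reba_action_level(score):
    """Return (action level, risk level, action) for an integer REBA score."""
    for low, high, level, risk, action in REBA_ACTION_LEVELS:
        if low <= score <= high:
            return level, risk, action
    raise ValueError(f"REBA score out of range: {score}")


print(reba_action_level(5))
```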
3.3. Load Estimation Based on Biomechanical Analysis
In this study, a novel foot pressure sensor insole (Moticon) was used to measure the total load on the worker’s feet, which includes both the worker’s own weight and the external load. The insole fits almost any type of footwear and transmits data wirelessly through the ANT service. The insoles carry 26 pressure sensors (13 per insole), each of which measures the average pressure in its area. Equation (6) obtains the overall ground reaction force of each insole by multiplying each sensor’s pressure by its area and summing over the sensors. The quantities in equation (6) are the ground reaction forces of the left and right feet, the plantar contact area of each foot (150 cm²), the sensor index and the largest sensor number, and the pressure values of the left and right feet. The pressure sensors were zeroed before the start of the experiment, and the OpenGo auto-zeroing mode was always active. This mode is based on an algorithm that continuously checks the sensor zero position and compensates for sensor offsets and drifts, which may occur due to variations in shoelace tightness and temperature. We used the OpenGo App for sensor calibration. Individual calibration and zeroing of the pressure sensors reduces the total force error to less than 5%.
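The force calculation can be sketched as follows. The sketch assumes that the 13 sensors of one insole cover equal shares of the 150 cm² contact area and that pressures are reported in N/cm²; both are simplifying assumptions, and the function name is hypothetical.

```python
import numpy as np

FOOT_AREA_CM2 = 150.0  # plantar contact area per foot, from the text
N_SENSORS = 13         # sensors per insole, from the text


def ground_reaction_force(pressures_n_per_cm2):
    """Ground reaction force (N) of one foot from its 13 sensor pressures."""
    p = np.asarray(pressures_n_per_cm2, dtype=float)
    if p.shape != (N_SENSORS,):
        raise ValueError(f"expected {N_SENSORS} pressure values")
    sensor_area = FOOT_AREA_CM2 / N_SENSORS  # assumption: equal sensor areas
    return float(np.sum(p * sensor_area))


# Hypothetical reading: a uniform 2 N/cm^2 over the whole insole.
left_grf = ground_reaction_force([2.0] * N_SENSORS)
print(left_grf)
```

A uniform pressure of 2 N/cm² over 150 cm² corresponds to a force of about 300 N, which is a convenient sanity check on the units.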
Human joint torque can be estimated by combining pressure insoles and computer vision [36, 43]. We collected the ground reaction forces of the operator’s left and right feet from the pressure insoles, summed them, and subtracted the operator’s body weight to obtain the external load. Next, we used the pose estimation technique to obtain joint positions and angles and computed the torques at the end-segment joints (left and right wrists and ankles). Then, we calculated the torques of the non-end-segment joints in turn; the whole process is shown in Figure 6. Finally, the workload of each joint was estimated according to the maximum load capacity of that joint. During handling tasks, the external loads typically act at the hands and at the feet (ground reaction forces). Both dual-arm and single-arm working styles are considered when assessing the force at the hands. The working style is identified by comparing the angles of the left and right shoulder and elbow joints: it is categorized as two-handed if the angles of the left and right arms are the same and as single-arm otherwise. The external load on the worker’s dominant hand also needs to be considered. Each body segment is treated as a particle with its own mass and corresponding weight, and the worker as a whole has a total weight and a corresponding total mass. The human skeleton diagram shown in Figure 2 contains 16 joint points, which determines the range of the joint and segment indices. The external load is described by its force and its mass.
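The external load and working style steps can be sketched as below. The text states that the arm angles must be "the same" for a two-handed classification; since estimated angles never match exactly, the sketch uses a tolerance, which is an assumption, as are the function names and the numbers in the example.

```python
G = 9.81  # gravitational acceleration in m/s^2


def external_load_n(grf_left_n, grf_right_n, body_mass_kg):
    """External load (N): total ground reaction force minus body weight."""
    return grf_left_n + grf_right_n - body_mass_kg * G


def working_style(left_shoulder, right_shoulder, left_elbow, right_elbow, tol_deg=10.0):
    """Classify the lift from left/right arm angles (degrees); tol_deg is hypothetical."""
    same_shoulder = abs(left_shoulder - right_shoulder) <= tol_deg
    same_elbow = abs(left_elbow - right_elbow) <= tol_deg
    return "two-handed" if same_shoulder and same_elbow else "single-arm"


# Hypothetical frame: a 70 kg worker holding a 10 kg box.
load = external_load_n(400.0, 384.8, 70.0)
print(load, load / G)
print(working_style(35.0, 38.0, 92.0, 88.0))
print(working_style(35.0, 5.0, 92.0, 10.0))
```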

After detecting the workers’ postures (the three-dimensional coordinates of the body’s joints) in Section 3.1 and measuring the pressure data using pressure insoles, we calculated the torque of each joint using biomechanical analysis and Newton’s laws of motion. For biomechanical analysis, the human skeleton can be simplified to a hinged linkage structure, with the bone corresponding to the lever and the joint representing the hinge. This simplification was utilized for calculations. During the analysis, it was presumed that the main joints’ movement of the worker’s body was steady and unhurried and that the joints were in a state of equilibrium.
Joint torque in the human body is mainly generated by the muscles surrounding the joint acting through their moment arms. The analysis method depends on the position of the joint among the body segments, and joint torques are divided into end-segment and non-end-segment joint torques. End-segment joints are joints that connect to only one body segment, while non-end-segment joints connect to multiple body segments. For a nonterminal body segment, the required quantities are the position of its far-end (distal) joint, the position of its near-end (proximal) joint, the mass of the segment, and the position of its center of mass. To calculate the joint torque at the proximal joint of the segment, the joint torque at its distal joint must be determined first. A static equilibrium torque equation is then established at the proximal joint of the segment.
In this equation, the gravitational force on the body segment acts at its center of mass, as shown in Figure 7 for the elbow joint.

For a distal segment, the required quantities are the position of its proximal joint, the mass of the segment, and the position of its center of mass. The segment bears the force of the external weight, which acts at the center of mass of the external weight. A static equilibrium torque equation can therefore be established at the proximal end of the segment to calculate the joint torque.
In this equation, one force is the gravitational force acting on the segment at its center of mass, and the other is the gravitational force of any external weight the segment is bearing. The static equilibrium equation for joint torque follows from these two contributions.
The remaining term represents the external torque applied to the body segment.
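The static equilibrium step can be illustrated with a minimal sketch. It assumes a z-up coordinate frame in metres, SI units, and the usual moment form r × F taken about the proximal joint; the sign convention (the moment of the loads, which the muscles must balance) is an assumption of the sketch, and the function names, segment masses, and positions are hypothetical.

```python
import numpy as np

G_VEC = np.array([0.0, 0.0, -9.81])  # gravity, assuming the z axis points up


def end_segment_moment(r_prox, r_com, seg_mass, r_ext=None, f_ext=None):
    """Moment (N*m) about the proximal joint of an end segment.

    Sums the moment of the segment weight at its centre of mass and, if given,
    the moment of an external force applied at r_ext.
    """
    r_prox = np.asarray(r_prox, dtype=float)
    moment = np.cross(np.asarray(r_com, dtype=float) - r_prox, seg_mass * G_VEC)
    if r_ext is not None and f_ext is not None:
        arm = np.asarray(r_ext, dtype=float) - r_prox
        moment = moment + np.cross(arm, np.asarray(f_ext, dtype=float))
    return moment


def inner_segment_moment(r_prox, r_dist, r_com, seg_mass, distal_moment, distal_force):
    """Moment (N*m) about the proximal joint of a non-end segment.

    Carries over the moment and force already found at the distal joint and
    adds the moment of the segment's own weight.
    """
    r_prox = np.asarray(r_prox, dtype=float)
    carried = np.cross(np.asarray(r_dist, dtype=float) - r_prox,
                       np.asarray(distal_force, dtype=float))
    own = np.cross(np.asarray(r_com, dtype=float) - r_prox, seg_mass * G_VEC)
    return np.asarray(distal_moment, dtype=float) + carried + own


# Hypothetical posture: vertical upper arm, horizontal forearm holding half of a 10 kg box.
shoulder, elbow, hand = [0.0, 0.0, 1.4], [0.0, 0.0, 1.1], [0.3, 0.0, 1.1]
forearm_mass, upper_arm_mass = 1.5, 2.0
box_force = 5.0 * G_VEC

elbow_moment = end_segment_moment(elbow, [0.15, 0.0, 1.1], forearm_mass, hand, box_force)
force_at_elbow = forearm_mass * G_VEC + box_force
shoulder_moment = inner_segment_moment(shoulder, elbow, [0.0, 0.0, 1.25],
                                       upper_arm_mass, elbow_moment, force_at_elbow)

print(np.linalg.norm(elbow_moment), np.linalg.norm(shoulder_moment))
```

In this posture the upper arm is aligned with gravity, so the forces passing through it add no moment and the shoulder carries the same moment as the elbow.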
This section aims to evaluate the workload of a worker based on joint torques. It takes into consideration that individuals possess varying load-bearing capacities, and as such, the evaluation of workload should factor in both external elements like external loads and postures as well as the worker’s load-bearing capacity. To measure human biomechanical capabilities, Maximum Voluntary Isometric Contraction (MVIC) is a widely accepted indicator. The National Isometric Muscle Strength Database Consortium [44] developed a regression equation based on over 500 experiments to predict MVIC using factors such as gender, age, height, and weight. This equation is employed to estimate joint capabilities.
In this equation, gender is coded as male = 1 and female = 0; a, b, c, and d are coefficients with values shown in Table 3; the maximum torque that the joint can withstand is expressed in N·m; and the torque arm length used when measuring the external load is expressed in m. Since the joint angle in the experiment is a right angle, the torque arm is equal to the length of the corresponding bone. Age, weight, and height are measured in years, kilograms, and meters, respectively.
The joint workload is calculated from the existing joint torque τ (N·m) and the maximal load capacity of the joint (N·m).
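A minimal sketch of the workload step follows. It assumes that the workload is the ratio of the current joint torque to the maximal joint torque and that the maximal torque is the predicted maximal force multiplied by the torque arm length; both forms are assumptions, the regression coefficients of Table 3 are deliberately not reproduced, and the numbers are hypothetical.

```python
def max_joint_torque(max_force_n, arm_length_m):
    """Maximal joint torque (N*m) from a maximal isometric force and its torque arm."""
    return max_force_n * arm_length_m


def joint_workload(torque_nm, max_torque_nm):
    """Joint workload as the fraction of the joint's capacity in use."""
    if max_torque_nm <= 0:
        raise ValueError("maximal torque must be positive")
    return abs(torque_nm) / max_torque_nm


# Hypothetical elbow: 250 N maximal force acting on a 0.25 m forearm.
capacity = max_joint_torque(250.0, 0.25)
workload = joint_workload(25.0, capacity)
print(capacity, workload)
```

A workload of 0.4 in this example would mean that the joint is using 40% of its estimated capacity.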
4. Experiment
4.1. Instrumentation
A total of 6 male volunteers (age: 23.4 ± 1.0 years, height: 1.78 ± 0.17 m, and weight: 70 ± 2.7 kg) and 4 female volunteers (age: 22.8 ± 1.2 years, height: 1.62 ± 0.05 m, and weight: 51 ± 2.7 kg) were recruited for this study. All participants had satisfactory physical fitness, exhibited no symptoms of musculoskeletal diseases, and were capable of completing the handling task independently. Before the experiment, all volunteers completed personal information and informed consent forms. A computer with an Intel(R) Core(TM) i5-12490F 3.00 GHz CPU and an NVIDIA GeForce RTX 3060 Ti GPU running Windows 10 was used to run the code and algorithms developed in this study. A smartphone (iPhone 11) was used to collect video data of the handling process, and a tripod kept the smartphone in a fixed position. Two pressure insoles (OpenGo, Moticon GmbH) were placed inside the volunteers' shoes to synchronously collect the load data of the participants during the handling task.
4.2. Experimental Setting and Procedure
The experimental setting is shown in Figure 8. All experiments were conducted at the same time of day, with a fixed light source, to keep lighting conditions consistent. The camera was fixed at a distance of 3 m from the sagittal plane and 1.25 m above the ground. During the experiment, participants were asked to transport a 10 kg box, simulating the process of material handling. The experiment consisted of four parts: (1) participants walked to the box on the right, (2) participants squatted to lift the box, (3) participants walked with the box to the platform on the left, and (4) participants placed the box on the platform. After completing one set of activities, each participant rested for 30 seconds and then repeated the task, after which the next participant performed the experiment. Each person completed two sets of experiments, so that a total of 20 sets of video frame data and OpenGo sensor data were collected.

4.3. Data Collection
The three-dimensional pose estimation method was employed to extract the three-dimensional joint positions from the video frames. For each frame, the resulting data are a 16 × 3 matrix containing the three-dimensional coordinates of the 16 joints. Each video clip lasted approximately 20 seconds and was divided into 600 frames, and every frame underwent three-dimensional pose estimation to obtain the joints' 3D coordinates. Joint angles were then calculated from these coordinates and used for the fuzzy Rapid Entire Body Assessment (REBA) analysis.
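The step from 3D joint coordinates to a joint angle can be sketched as follows. This is a minimal illustration, assuming the angle at a joint is taken as the angle between the two adjacent body segments; the function name and the sample coordinates are hypothetical, and the angle definitions used for REBA scoring in this study may differ for individual body parts.

```python
import math

Point = tuple[float, float, float]


def joint_angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle at joint b, in degrees, between segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("adjacent joints must not coincide")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates (m): forearm held horizontally, upper arm vertical.
shoulder = (0.0, 1.4, 0.0)
elbow = (0.0, 1.1, 0.0)
wrist = (0.3, 1.1, 0.0)
print(joint_angle_deg(shoulder, elbow, wrist))
```

Applied to each row triplet of the 16 × 3 matrix, the same function yields one angle per joint per frame.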
The collected pressure data were used to evaluate the weight of the worker and any other loads acting on them. Smart shoe insoles were used to assess the total weight, that is, the ground reaction force. The pressure data from each sensor were recorded to determine the average pressure in the corresponding area. Figure 8 shows the distribution of the pressure values of the left and right insoles while the subjects lifted and carried the cardboard box. The pressure measurements from each of the 13 pressure sensors on the right insole are given in Table 4. Video data were captured at 30 frames per second (fps), while the insole pressure data were recorded at 50 Hz.
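Because the two data streams run at different rates, the insole signal has to be brought onto the video frame timeline before it can be combined with the per-frame joint coordinates. The text does not state how this alignment was done; the sketch below assumes linear interpolation and a common start time for both streams, and the function name is hypothetical.

```python
def resample_linear(samples: list[float], src_hz: float, dst_hz: float) -> list[float]:
    """Linearly interpolate a uniformly sampled signal to a new sampling rate.

    Assumption: both streams start at t = 0 and are uniformly sampled.
    """
    if len(samples) < 2:
        return list(samples)
    duration = (len(samples) - 1) / src_hz
    n_out = int(duration * dst_hz + 1e-9) + 1
    out = []
    for k in range(n_out):
        pos = k / dst_hz * src_hz            # fractional index in the source
        i = min(int(pos), len(samples) - 2)  # left neighbour
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out


# One second of a hypothetical 50 Hz insole signal mapped onto 30 fps frames.
insole_50hz = [float(i) for i in range(51)]
per_frame = resample_linear(insole_50hz, src_hz=50.0, dst_hz=30.0)
print(len(per_frame))
```

If the recordings do not share a start time, an offset would have to be estimated first, for example from a distinctive event visible in both streams.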
4.4. Results
The experimental process of the volunteers carrying boxes is shown in Figure 9, which presents 8 keyframes of the motion together with the corresponding 2D joint estimation skeleton images and the estimated 3D joint coordinates. Our shooting angle allows the sagittal plane of the volunteers to be clearly captured while also showing the right half of the volunteer's body. The recognition results show that even when the right half of the volunteer's body occludes the left half, and when the left arm is partially occluded while carrying the box, there is no recognition error in the 2D joint recognition. Therefore, the obtained 3D joint coordinates match the coordinates of the volunteer's actual activity.

The recognition accuracy of the 3D pose estimation method in this paper was compared with that of an IMU sensor measurement method. The experiment was the same as before, with subjects completing the four operational tasks. Subjects wore inertial motion capture devices while video was recorded to capture joint positions, and a total of 600 video frames were captured. Pose recognition provided data for 16 body joints, whereas the IMU sensors captured position information for only 14 joints (the neck and buttocks joints were not included). The experimental procedure is shown in Figure 10.

First, we compared the joint positions estimated by the 3D pose method in this paper with the joint positions measured by the IMU sensors, as shown in Figure 11(a). The errors of most frames fall within the range of 2.25 cm to 4.75 cm, with an average error of 4.78 cm and a standard error of 1.47 cm. Next, we compared the joint angle values computed by the method in this paper with the data from the IMU sensors, as shown in Figure 11(b). More than 470 video frames had errors in the range of −5° to 5°, with an average angular error of −0.66° and a standard error of 9.25°. Only a very small number of video frames had larger errors, within the range of −20° to 20°. These were attributed to heavy overlap and occlusion of the limbs at certain moments, which produced large joint angle errors, usually in the right and left knee joints. Overall, the method in this paper shows high accuracy when compared with IMU measurements. In addition, it does not require setting up an experimental environment for IMU sensors and does not interfere with the operator's normal work, which makes it more portable.
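The position comparison above can be sketched as follows. This is a minimal illustration, assuming the per-frame error is the mean Euclidean distance over the joints available from both systems, and using the sample standard deviation as the measure of spread; the statistic reported in this study may be defined differently, and all names and values below are hypothetical.

```python
import math
import statistics

Joint = tuple[float, float, float]


def mean_joint_error(pred: list[Joint], ref: list[Joint]) -> float:
    """Mean Euclidean distance between matched joints of a single frame."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("joint lists must be non-empty and of equal length")
    return statistics.fmean(math.dist(p, r) for p, r in zip(pred, ref))


def summarize(frame_errors: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of per-frame errors."""
    return statistics.fmean(frame_errors), statistics.stdev(frame_errors)


# Two hypothetical frames with two joints each (units: cm).
vision = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]]
imu = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)], [(0.0, 1.0, 0.0), (2.0, 0.0, 2.0)]]
errors = [mean_joint_error(v, m) for v, m in zip(vision, imu)]
print(summarize(errors))
```

Both coordinate sets must be expressed in the same reference frame and units before this comparison is meaningful.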

5. Discussion
5.1. Comparison and Discussion of Experiments between REBA and Improved REBA
Figures 12 and 13 show the REBA scores and risk levels of the subjects over a 20-second (600-frame) handling process. The data were obtained by averaging the risk assessment results of the 10 volunteers. We compared two evaluation methods: the traditional REBA scoring method and the improved REBA evaluation method based on fuzzy logic, referred to as FREBA in the figures. In the REBA score chart, the traditional method uses discrete total scores, which produces a staircase-shaped line. As joint angles change over time during the lifting process, the REBA score jumps whenever one or more joints cross a scoring threshold. The FREBA score obtained after fuzzy processing is a continuous decimal value, hence the smooth curve in the chart; it follows gradual posture changes more closely than the traditional REBA evaluation and avoids abrupt score fluctuations. In Figure 13, the risk level line of the traditional REBA evaluation fluctuates along with the score, whereas the FREBA result remains stable and better matches the posture and risk status of the subjects at each moment.
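The difference between the staircase and the smooth curve can be illustrated with a single body part. The sketch below is not the FREBA rule base used in this study: it assumes a simplified flexion-angle scale with crisp thresholds at 20° and 60°, hypothetical triangular membership functions peaking at 10°, 40°, and 80°, and a weighted-average defuzzification.

```python
def crisp_score(angle_deg: float) -> int:
    """Threshold-based score: jumps by one point at 20 and 60 degrees."""
    if angle_deg <= 20.0:
        return 2
    if angle_deg <= 60.0:
        return 3
    return 4


def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_score(angle_deg: float) -> float:
    """Weighted average of the scores, weighted by membership degree."""
    x = min(max(angle_deg, 10.0), 80.0)  # saturate beyond the outer peaks
    memberships = {
        2: tri(x, -20.0, 10.0, 40.0),
        3: tri(x, 10.0, 40.0, 80.0),
        4: tri(x, 40.0, 80.0, 120.0),
    }
    total = sum(memberships.values())
    return sum(score * mu for score, mu in memberships.items()) / total


for angle in (19.0, 21.0):
    print(angle, crisp_score(angle), round(fuzzy_score(angle), 3))
```

A 2° change across the 20° threshold moves the crisp score by a full point, while the fuzzy score changes only slightly, which is the behaviour visible as the smooth FREBA curve.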


Table 5 presents the joint angles, REBA scores, FREBA scores, and risk levels of participants in the four missions. In mission 1, participants performed normal walking without any risk. In mission 2, participants squatted down to lift a cardboard box, which resulted in high scores due to the large hip joint and thigh bending angles. The current action should therefore be adjusted as soon as possible, and carrying work from a very low position should be minimized in future missions. In mission 3, participants carried the cardboard box while walking, which resulted in a relatively low score, but the lower arm was kept at around 90° and needed appropriate adjustment. The weight of the load and the upper arm load are analyzed in the following section. In mission 4, participants lifted the cardboard box onto a higher platform, resulting in a shoulder joint angle of 107.4°, an elbow joint angle of 22.8°, and a risk level of 3, indicating the need to complete the mission as soon as possible. The table also compares the traditional REBA and FREBA scores, which have a Spearman correlation coefficient [45] of 0.817. According to the table of critical values for the Spearman rank correlation coefficient, with a sample size of 6, a correlation coefficient r greater than 0.727 gives 99% confidence that the two random variables are related. Therefore, at the 99% confidence level, the FREBA scores are consistent with the traditional REBA scores, which indicates that the method can effectively improve the REBA evaluation method and has high reliability.
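The rank correlation used in this comparison can be sketched as follows. This is a minimal standard-library illustration that computes the Spearman coefficient as the Pearson correlation of average ranks (which also handles ties); the score lists are hypothetical and are not the values in Table 5.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("rank variance is zero")
    return cov / (sx * sy)


# Hypothetical integer REBA scores and continuous fuzzy scores.
reba = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
freba = [1.2, 3.1, 2.8, 4.4, 6.3, 5.9]
print(round(spearman_rho(reba, freba), 4))
```

The resulting coefficient is then compared against the tabulated critical value for the given sample size to decide whether the correlation is significant.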
5.2. Discussion of Experimental Results for Joint Torque and Joint Load Assessment
We obtained the torque data of the 10 volunteers moving boxes by means of pressure insoles and biomechanical calculations and averaged them. The torque data in the figure do not take direction into account and can be regarded as absolute values. Figure 14 includes eight joint torque curves and indicates the periods and torque values of four keyframes for the different missions. In mission one, the elbow and shoulder joint torques of the volunteers did not change significantly, but the hip and knee joint torques showed periodic fluctuations due to leg movements during walking. When volunteers performed mission two, squatting to lift boxes, the torques of the various joints increased significantly. The knee joint torque increased to 80 N·m, and the hip joint torque increased to 70 N·m, which corresponds to the REBA score and risk level we evaluated. In mission three, when volunteers carried boxes while walking, the knee and hip joint torques maintained the wave-like pattern of walking torques, but the torques of each joint were higher than those in mission one because of the external workload of carrying the box. In mission four, because the box had to be lifted to a higher platform for placement, the torque values of the left and right shoulders of the volunteers also increased to a peak of 28.16 N·m, after which the entire lifting process was completed. The torque curves allow the torque of each joint during the entire lifting process to be observed intuitively, and these data can be further studied as the working load of each joint.

Using the physiological information of the volunteers and formula (11), we calculated the joint capacity of the volunteers and the joint workload ratios corresponding to the four missions. Combining Table 6 and Figure 15, we found that the joint workload ratios of the subjects were highest in mission 2, with the workload ratios of the left and right shoulder and hip joints approaching 30%, which is consistent with the previous risk assessment. Comparing missions 1 and 3, the added external workload of the cardboard box increased the hip joint and knee joint workload ratios by about 4%. In mission 4, because the arms had to be raised, the workload ratios of the left and right shoulder joints rose to 27.77% and 26.75%, respectively, and the other joint workload ratios increased slightly compared to mission 1. The joint workload ratio did not exceed 40% at any point during the handling process, because the selected cardboard box was relatively light and the volunteers' range of motion was relatively standardized.

5.3. Superiority Compared to Existing Research
Our research has the following advantages. First, we propose a computer vision-based postural risk assessment method; the comparison with wearable IMU sensors illustrates that the method has high recognition accuracy and better portability. Second, we improved the traditional REBA risk assessment method by introducing fuzzy logic theory, which avoids sudden changes in the overall risk level caused by the input of a specific angle and better expresses the score transition during the movement of body joints. FREBA is therefore more suitable for the vision-based postural risk assessment in the current study. Third, we included the assessment of joint workload to make the postural risk assessment more complete. Risk assessment based on vision technology alone has some limitations. For example, if an operator moves two objects with a large mass difference in sequence with an identical posture, the machine vision risk scores are the same, but the fatigue level of the operator is completely different. In this study, this problem is addressed by estimating the workload of different joints through noninvasive pressure sensors. Like the machine vision method, this approach does not affect the normal operation of the workers. It can be widely used in the construction industry, logistics handling operations, and other practical work scenarios to assess the postural risk of workers and to take timely preventive measures that improve their occupational health and work efficiency.
5.4. Limitations and Future Work
The methodology proposed in this study aims to monitor, assess, and predict the risk level of WMSDs from operational videos and pressure-insole data. There are many ways to apply the method in a company, such as postural risk monitoring and assessment of operators at the actual workplace, operational safety training for new operators, and improvement of a given work process based on the risk assessment results to reduce the potential risk of WMSDs. In practice, however, machines or other objects may occlude the operator, so it is important to choose a good video acquisition angle and to place the camera as close as possible to the sagittal plane of the operator, so that the operator's posture is exposed as fully as possible to the field of view. The estimation of joint workload by means of pressure insoles relies on the ground reaction force to obtain the external load, so the operator's feet must remain on the ground. Leaning or sitting on the ground can cause estimation errors and should be avoided.
The experimental results show that the method proposed in this paper is an accurate, reliable, time-saving, and convenient MSD risk assessment method. It can estimate not only posture risk but also joint torque and workloads, making it a useful tool for risk assessment in actual work environments. However, this study also has certain limitations. More volunteers are needed to represent different labor forces, such as different body types, body mass index categories, and genders. In the biomechanical load estimation, we assumed that human motion is a uniform process; if the worker's motion is unstable, acceleration can increase joint torque. In addition, the external load estimation method used in this study subtracts the worker's own weight from the force detected by the insoles, which assumes that all of the load on the worker is borne by the feet. It therefore does not apply to situations where the worker is sitting or sharing the weight on the knee.
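The external load estimate discussed above can be sketched as follows. This is a minimal illustration under the stated assumptions of uniform (quasi-static) motion and all weight being borne by the feet; the function name, the gravitational constant, and the sample values are assumptions for illustration only.

```python
G = 9.81  # gravitational acceleration (m/s^2), assumed constant


def external_load_kg(total_insole_force_n: float, body_mass_kg: float) -> float:
    """Estimate the carried mass from the total vertical force on both insoles.

    Assumptions: quasi-static motion (no acceleration term) and both feet
    carrying the full weight of the worker plus the load.
    """
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    load = total_insole_force_n / G - body_mass_kg
    # Negative estimates (sensor noise, a foot leaving the ground) are
    # clipped to zero rather than reported as a negative load.
    return max(load, 0.0)


# Hypothetical reading: a 70 kg worker holding a 10 kg box while standing still.
force = (70.0 + 10.0) * G
print(round(external_load_kg(force, body_mass_kg=70.0), 2))
```

During walking, one foot periodically leaves the ground and the vertical force oscillates, so in practice the force would need to be averaged over a gait cycle before applying this subtraction.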
To address the limitations mentioned above, future research will include: (1) incorporating acceleration data, if available, to analyze joint torque through kinematics; (2) developing more precise methods for estimating external load, such as utilizing deep learning algorithms to identify carried objects and using this information to assist in weight estimation; (3) exploring whether a new quantitative standard can combine posture risk and joint load for human ergonomics evaluation; and (4) using more actual work scenarios to train 3D motion estimation algorithms, especially scenarios with heavy visual obstruction between workers and cameras.
6. Conclusions
This study proposes a noninvasive method for assessing working posture risk and analyzing workload. It uses computer vision algorithms and smart shoe insoles to collect posture data of workers during lifting operations and outputs the worker's REBA assessment data and joint workload data. First, a 3D pose recognition method based on MMPose is proposed and verified to obtain human 3D skeletal data and calculate human joint angles. Second, fuzzy logic is introduced into the REBA risk assessment to avoid sudden jumps or drops in the assessment results caused by small changes in joint angle inputs, and the experiment shows that the REBA evaluation method with fuzzy logic fusion is more accurate and reliable. Third, the torque of each joint is obtained by applying inverse kinematics to the human 3D skeletal data and pressure insole data, and the workload ratio of the different joints is calculated. This study provides a new machine vision-based approach to practical work posture risk assessment and workload analysis, which can help improve ergonomics.
Data Availability
The data presented in this study are available on request from the corresponding author. The data are not publicly available to protect the subjects’ privacy.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Science and Technology Foundation of Guizhou Province ([2020]1Y262) and Key Project of Guizhou Provincial Science and Technology Plan (ZK[2023]015).