Review Article
The Applicability of Reinforcement Learning Methods in the Development of Industry 4.0 Applications
Table 2
Z-test table of principle-captured classes by RL method types.
| Principle captured | Markov decision process | Multiarmed bandit | Dynamic programming | Temporal difference | Value function approximation | Policy gradient | Multiagent | Edge computing |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Prediction, forecasting, estimation, planning | 1.96 | −0.58 | 0.12 | −0.32 | 3.08 | 0.31 | −0.39 | −4.17 |
| Detection, recognition, prevention, avoidance, protection | −3.09 | 1.51 | −1.11 | −0.17 | 0.39 | 2.2 | −1.38 | 1.65 |
| Evaluation, assessment | −0.82 | 0.63 | −0.08 | −0.62 | −1.21 | −0.6 | 2.96 | −0.27 |
| Classification, clustering | −3.61 | 0.56 | 1.28 | 1.41 | 1.54 | 0.65 | −2.88 | 1.05 |
| Decision making | 3.1 | 2.47 | −0.84 | 1.73 | 0.95 | 0.02 | −10.82 | 3.39 |
| Allocation, assignment, resource management | −2.69 | −1.95 | 0.21 | −11.18 | −4.74 | 2.08 | 10.54 | 7.74 |
| Scheduling, queuing, planning | 2.17 | −1.33 | −2.17 | 1.24 | 1.61 | −0.04 | −2.63 | 1.16 |
| Control | 2.67 | −1.34 | 2.35 | 7.52 | −1.72 | −4.93 | 4.27 | −10.82 |
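The computation behind these z-values is not restated alongside the table. A common construction for this kind of cross-tabulation test is the adjusted standardized residual of an observed-versus-expected count table, where positive values indicate a principle class that co-occurs with a method type more often than independence would predict. The sketch below is a minimal illustration under that assumption; the function name and the example counts are hypothetical and not taken from the review.

```python
import numpy as np

def adjusted_standardized_residuals(counts):
    """Adjusted standardized (z-scored) residuals of a contingency table.

    Positive values mean a row class appears with a column class more often
    than expected under independence; |z| > 1.96 is significant at the 5% level.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    row = counts.sum(axis=1, keepdims=True)   # row totals (principle classes)
    col = counts.sum(axis=0, keepdims=True)   # column totals (RL method types)
    expected = row @ col / n                  # expected counts under independence
    variance = expected * (1 - row / n) * (1 - col / n)
    return (counts - expected) / np.sqrt(variance)

# Hypothetical counts: rows = principle-captured classes, columns = RL method types.
example = [[12, 3, 5],
           [4, 9, 2],
           [6, 5, 14]]
print(np.round(adjusted_standardized_residuals(example), 2))
```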