Research Article

Designing Deep Learning Hardware Accelerator and Efficiency Evaluation

Table 3

The comparative evaluation of the existing and the proposed implementation schemes.

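| Experimental platform | CPU | GPU | FPGA [6] | FPGA (proposed) |
| --- | --- | --- | --- | --- |
| Platform configuration | i5-10400F | GTX 1660Ti | V6-690T | Xilinx Kintex-7 |
| Data type | Fp32 | Fp32 | Fix16 | Fix16/Fp32 |
| Clock frequency (MHz) | 4300 | 1845 | 18 | 18 |
| Execution time (s) | 176.2 | 3.9 | - | 20.3 |
| Energy consumption (W) | 65 | 120 | 25.6 | 23.3 |
| Throughput (GOPS) | 1.359 | 117.4 | 41.32 | 76.19 |
| Energy efficiency (GOPS/W) | 0.0209 | 0.978 | 1.65 | 3.27 |
| Speedup | - | 40.98 | - | 8.67 |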
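
The energy-efficiency row of Table 3 appears to be the throughput row divided by the energy-consumption row. The short Python sketch below recomputes it under that assumption, using the values as read from the table; the dictionary layout and the function name `energy_efficiency` are illustrative and do not come from the article.

```python
# Values as read from Table 3; names and layout are illustrative assumptions.
platforms = {
    "CPU":             {"throughput_gops": 1.359, "power_w": 65.0,  "reported": 0.0209},
    "GPU":             {"throughput_gops": 117.4, "power_w": 120.0, "reported": 0.978},
    "FPGA [6]":        {"throughput_gops": 41.32, "power_w": 25.6,  "reported": 1.65},
    "FPGA (proposed)": {"throughput_gops": 76.19, "power_w": 23.3,  "reported": 3.27},
}


def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Assumed definition: throughput (GOPS) divided by power draw (W)."""
    return throughput_gops / power_w


for name, row in platforms.items():
    computed = energy_efficiency(row["throughput_gops"], row["power_w"])
    # Print the recomputed value next to the value reported in the table.
    print(f"{name:16s} computed={computed:.4f} GOPS/W  reported={row['reported']} GOPS/W")
```

Under this assumption the CPU, GPU, and proposed-FPGA entries match the reported figures after rounding, while the FPGA [6] entry comes out near 1.61 GOPS/W rather than the reported 1.65, which may reflect a value carried over from the cited work.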