Research Article
An FPGA-Based Hardware Accelerator for CNNs Using On-Chip Memories Only: Design and Benchmarking with Intel Movidius Neural Compute Stick
Table 10
Performance comparison between Xilinx FPGAs, Intel FPGAs, and NCS.
| | Device | fclk (MHz) | Inference time (ms) | Total power (W) | Energy (mJ) |
|---|---|---|---|---|---|
| Xilinx FPGA families | Artix 7 | 47.6 | 0.94 | 1.043 | 0.98 |
| | Kintex-7 lv | 48.2 | 0.93 | 0.969 | 0.90 |
| | Zynq-7000 | 67.8 | 0.65 | 1.387 | 0.90 |
| | Virtex 7 | 63.5 | 0.71 | 1.351 | 0.96 |
| | Virtex-US | 78.4 | 0.57 | 1.861 | 1.01 |
| | Virtex-US+ | 104.2 | 0.43 | 2.141 | 0.92 |
| | Zynq-US+ | 116.4 | 0.39 | 2.259 | 0.88 |
| Intel FPGA families | Cyclone V | 31.4 | 1.43 | 2.301 | 3.29 |
| | Stratix V E | 57.4 | 0.78 | 3.757 | 2.9 |
| | Stratix V GS | 60.3 | 0.74 | 4.010 | 2.96 |
| | Arria 10 | 61 | 0.73 | 1.002 | 0.73 |
| | Stratix V GX | 80 | 0.56 | 3.385 | 1.9 |
| Intel Movidius Neural Compute Stick | NCS | 600 | 10 | 0.810 | 8.1 |
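The Energy column in Table 10 appears consistent with the product of inference time and total power (W × ms = mJ). This relation is inferred from the tabulated values and is not stated in this excerpt. The following minimal sketch checks it for a few rows copied from the table; the function name `energy_mj` and the choice of rows are illustrative assumptions, not part of the original work.

```python
def energy_mj(inference_time_ms: float, total_power_w: float) -> float:
    """Energy per inference in millijoules, assuming energy = power x time (W x ms = mJ)."""
    return inference_time_ms * total_power_w


# (device, inference time in ms, total power in W, reported energy in mJ),
# values copied from Table 10
rows = [
    ("Artix 7", 0.94, 1.043, 0.98),
    ("Zynq-US+", 0.39, 2.259, 0.88),
    ("Arria 10", 0.73, 1.002, 0.73),
    ("NCS", 10.0, 0.810, 8.1),
]

for device, t_ms, p_w, reported in rows:
    computed = energy_mj(t_ms, p_w)
    print(f"{device:10s} computed {computed:.2f} mJ, reported {reported} mJ")
```

Under this assumption, the NCS draws the least power of all listed devices but, because of its 10 ms inference time, spends roughly an order of magnitude more energy per inference than the Xilinx devices and the Arria 10.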