Research Article

DeepVariant-on-Spark: Small-Scale Genome Analysis Using a Cloud-Based Computing Framework

Table 1

Comparison of variant calling results of DeepVariant and DeepVariant-on-Spark with different combinations of CPUs/GPUs.

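| Variant calling pipeline | Variant type | CPU (a) | GPU (b) | F1 (c) | Recall | Precision | True positive | False negative | False positive | Genotype mismatch | Total number of SNV calls |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepVariant | SNP | 16 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040855 | 1928 | 1744 | 363 | 3886287 |
| DeepVariant | SNP | 32 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886337 |
| DeepVariant | SNP | 64 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886366 |
| DeepVariant | SNP | 96 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040855 | 1928 | 1744 | 363 | 3886339 |
| DeepVariant | SNP | 16 | 1 | 0.99940 | 0.99937 | 0.99943 | 3040855 | 1928 | 1744 | 363 | 3886287 |
| DeepVariant | SNP | 16 | 4 | 0.99940 | 0.99937 | 0.99943 | 3040855 | 1928 | 1744 | 363 | 3886287 |
| DeepVariant | SNP | 32 | 2 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886337 |
| DeepVariant | SNP | 64 | 4 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886366 |
| DeepVariant-on-Spark | SNP | 32 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886403 |
| DeepVariant-on-Spark | SNP | 64 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886403 |
| DeepVariant-on-Spark | SNP | 128 | 0 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886403 |
| DeepVariant-on-Spark | SNP | 32 | 2 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886403 |
| DeepVariant-on-Spark | SNP | 64 | 4 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886404 |
| DeepVariant-on-Spark | SNP | 128 | 8 | 0.99940 | 0.99937 | 0.99943 | 3040856 | 1927 | 1744 | 363 | 3886403 |
| DeepVariant | Indel | 16 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868527 |
| DeepVariant | Indel | 32 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868535 |
| DeepVariant | Indel | 64 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868520 |
| DeepVariant | Indel | 96 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868535 |
| DeepVariant | Indel | 16 | 1 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868527 |
| DeepVariant | Indel | 16 | 4 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868528 |
| DeepVariant | Indel | 32 | 2 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868535 |
| DeepVariant | Indel | 64 | 4 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868520 |
| DeepVariant-on-Spark | Indel | 32 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868541 |
| DeepVariant-on-Spark | Indel | 64 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868541 |
| DeepVariant-on-Spark | Indel | 128 | 0 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868541 |
| DeepVariant-on-Spark | Indel | 32 | 2 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868542 |
| DeepVariant-on-Spark | Indel | 64 | 4 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868542 |
| DeepVariant-on-Spark | Indel | 128 | 8 | 0.96168 | 0.95711 | 0.96628 | 478265 | 21432 | 17373 | 11151 | 868541 |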

(a) CPU is the number of CPU cores. (b) GPU is the number of NVIDIA Tesla P100 GPUs. (c) F1 is the F1 score, calculated as $F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$.
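As a quick consistency check on Table 1, the sketch below recomputes recall and F1 from the tabulated values. It assumes the standard definitions (recall as true positives over true positives plus false negatives, and F1 as the harmonic mean of precision and recall). The function names are illustrative, and the counts are read from the first DeepVariant SNP and indel rows of the table.

```python
def recall(true_positive: int, false_negative: int) -> float:
    """Fraction of truth variants that were recovered."""
    return true_positive / (true_positive + false_negative)


def f1_score(precision: float, recall_value: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall_value / (precision + recall_value)


# SNP row (DeepVariant, 16 CPU cores, 0 GPUs), values as read from Table 1.
snp_recall = recall(3040855, 1928)
snp_f1 = f1_score(0.99943, 0.99937)

# Indel row (DeepVariant, 16 CPU cores, 0 GPUs), values as read from Table 1.
indel_recall = recall(478265, 21432)
indel_f1 = f1_score(0.96628, 0.95711)

print(f"SNP   recall={snp_recall:.5f}  F1={snp_f1:.5f}")
print(f"Indel recall={indel_recall:.5f}  F1={indel_f1:.5f}")
```

The recomputed values agree with the tabulated recall and F1 up to rounding of the five-decimal inputs. Precision is taken from the table rather than recomputed, since the benchmarking tool may count query-side true positives differently from truth-side true positives.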