Abstract
African swine fever (ASF), a contagious viral disease, poses a significant threat to the global swine industry. In South Korea, ASF outbreaks have occurred since 2019, highlighting the need for a comprehensive understanding of the epidemiology and genetic characterization of the circulating African swine fever viruses (ASFVs). We obtained 21 ASFV isolates from domestic pig farms and analyzed their whole-genome sequences using the Illumina MiniSeq. Phylogenetic analysis was conducted using the maximum likelihood and time-scaled approaches to determine the genetic relationships and evolutionary dynamics of the Korean ASFV isolates. Comparative analysis of the 21 ASFV genomes with the reference strain Georgia 2007/1 revealed that while Korean isolates shared 11 mutations, they also had 22 discrete mutations, including single nucleotide polymorphisms and insertion/deletion polymorphisms (Indels). Phylogenetic analysis indicated that all Korean isolates were within the Asian subgroup of ASFV genotype II but were further divided into at least three distinct subclusters. Spatiotemporal analysis indicated multiple introductions of ASFVs into South Korea, crossing the national border with North Korea. In addition, we observed putative self-recombination between MGF 505-9R and MGF 505-10R genes in the ASFV/Korea/Pig/Inje2/2021 strain. Our findings provide insights into the genetic variations and evolution of ASFVs on South Korean pig farms from 2019 to 2021, uncovering multiple introductions of ASFVs across the national border, and highlighting the need for enhanced disease control strategies.
1. Introduction
African swine fever (ASF) is a highly contagious viral disease that affects domestic pigs and wild boars and causes severe economic losses and trade disruptions in the swine industry worldwide. The disease is caused by the ASF virus (ASFV), a large double-stranded DNA virus belonging to the Asfarviridae family. ASFV exhibits a wide range of clinical symptoms and high mortality rates in domestic pigs, with acute forms characterized by high fever, depression, and hemorrhage [1]. ASFV genotyping is based on the analysis of specific genetic markers such as the major protein p72 encoded by the gene B646L and the central variable region (CVR) within B602L [2–6]. ASFV strains have been classified into 24 genotypes, with genotypes I and II found outside Africa [4, 7, 8].
Genotype I ASFV strains were first reported in Portugal in 1957 and were subsequently detected in several countries, including Spain, France, Italy, Brazil, the Dominican Republic, and Haiti, in the 1960s and 1970s [9, 10]. ASF caused by genotype I strains has since been eradicated in all of these countries except Sardinia, Italy, where it remained endemic [11]; in 2021, China reported genotype I ASFV isolates for the first time in Asia [12]. Genotype II strains were introduced into Georgia in 2007, marking the first genotype II outbreak outside Africa, and had spread throughout the Trans-Caucasian region and into Europe by 2014 [13, 14]. In Asia, China reported its first ASF outbreak on a pig farm in 2018, followed by many other countries, including Mongolia, Vietnam, Cambodia, North Korea, Laos, the Philippines, Myanmar, Timor-Leste, Indonesia, and India. Since 2007, most ASFVs reported in European and Asian countries have been genotype II [15–27].
In South Korea, ASF was first detected in 2019 on a pig farm in Paju, Gyeonggi Province. By the end of 2021, 21 outbreaks had been reported in Korea [28]. In a previous study that analyzed 12 gene markers using partial sequencing [28], all 21 ASFV strains isolated from affected pig farms were found to belong to p72 genotype II, serogroup 8, with intergenic region (IGR) I73R-I329L variant II and CVR 1. Notably, no tandem repeat sequence (TRS) insertions were detected in the IGR A179L-A137R or the IGR MGF 505 9R/10R, and no variations were observed in the O174L, K145R, MGF 505-5R, CP204L, or Bt/Sj regions among the 21 Korean isolates. In addition, the analyzed genes of these isolates were identical to those of Georgia 2007/1, the Chinese strains Pig/HLJ/2018 and China/2018/AnhuiXCGQ, and the Vietnamese strain ASFV_NgeAn_2019. However, further analysis revealed that X69R, located in the J268L region of the 18th isolate (Korea/Pig/Goseong/2021), had a single tyrosine (Y) insertion at position 209. The previous study [28] concluded that this finding implies slight variation among the ASFVs circulating in South Korea from 2019 to 2021 and that the source of the virus on the 18th infected farm (Korea/Pig/Goseong/2021) differed from that of the other 20 pig farms. However, the detailed epidemiology of ASFV outbreaks in South Korea has not yet been fully determined owing to the low resolution of the available genomic data.
Recently, next-generation sequencing (NGS) methods have been applied for whole-genome sequencing of viruses, including ASFV [29–32]. To further understand the epidemiology, transmission, and evolution of ASFV in South Korea, in this study, we conducted whole-genome sequencing of 21 strains isolated between 2019 and 2021 using NGS methods. We analyzed genetic polymorphisms among Korean ASFVs and conducted a spatiotemporal transmission analysis using a time-scaled phylogenetic tree.
2. Materials and Methods
2.1. Samples
A total of 21 outbreaks were identified in South Korea between 2019 and 2021 (14 in 2019, 2 in 2020, and 5 in 2021). Six outbreaks (three each in 2019 and 2021) were detected through nationwide monitoring of pig farms, which was initiated following the initial ASF outbreak in September 2019. In addition, two positive pigs were detected during testing at a slaughterhouse in 2020, leading to the tracing and confirmation of their origin farm and a neighboring farm. The remaining 13 outbreaks were confirmed based on notifications from pig farmers reporting sick or dead pigs. Samples (blood or spleen) from all 21 outbreaks were collected for further analysis. Real-time PCR confirmed the presence of ASFV (Table 1).
2.2. DNA Extraction and Real-Time PCR for ASFV Detection
Viral DNA was extracted from the samples using the Maxwell® RSC Total Nucleic Acid Kit and Maxwell RSC Whole Blood DNA Extraction Kit (Promega, Madison, WI, USA) according to the manufacturer’s instructions. The extracted DNA was stored at −20°C until analysis. Real-time PCR targeting the B646L gene, which encodes p72, was performed to detect ASFV genomic DNA using a Bio-Rad CFX-96 instrument (Bio-Rad, Hercules, CA, USA) as described in the World Organisation for Animal Health (WOAH) Manual [33, 34].
2.3. Whole-Genome Sequencing
DNA sequencing libraries for Illumina MiniSeq (Illumina, San Diego, CA, USA) were prepared using the Illumina Nextera XT DNA Library Preparation Kit and the Nextera XT Index Kit v2 Set A (Illumina) according to the manufacturer’s instructions. For target enrichment, a library was prepared using an Enzymatic Preparation Kit (Celemics, Seoul, Republic of Korea). The prepared genomic DNA library was hybridized with capture probes using the Celemics Target Enrichment Kit (Celemics). Capture probes were chemically synthesized to hybridize with the target region, and the captured regions were amplified by post-capture PCR to enrich the genomic DNA. Before sequencing, library quality was assessed using a BioAnalyzer 2100 (Agilent, Santa Clara, CA, USA) with an Agilent Bioanalyzer DNA High Sensitivity kit (Agilent), and libraries were quantified using a dsDNA High Sensitivity Assay kit (Thermo Fisher Scientific) and a Qubit 2.0 Fluorometer (Thermo Fisher Scientific). MiniSeq sequencing was conducted in 150 bp paired-end mode using the MiniSeq High Output Reagent kit (300-cycle) (Illumina) according to the manufacturer’s instructions.
2.4. Data Analysis
Adapter sequences and low-quality sequencing reads with a quality score below 70 were trimmed using the BBDuk v38.84 program. Taxonomic classification with the KRAKEN2 program (https://ccb.jhu.edu/software/kraken2/) was used to determine the percentage of ASFV reads among the remaining reads. The trimmed sequencing reads were then assembled by reference mapping against ASFV Georgia 2007/1 (GenBank Accession number NC_044959) using Geneious Prime software (https://www.geneious.com/). To minimize erroneous mappings, a maximum of 10% mismatch was allowed during the reference mapping process. Consensus sequences were generated considering only sites with coverage depths >20. NGS and assembly data are summarized in Table 2. Owing to the limitations of our short-read sequencing system, we were unable to obtain sequences for the terminal inverted repeat regions at both ends of the genome. The whole-genome sequences were deposited in GenBank (Table 1).
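The depth filter applied when generating consensus sequences can be illustrated with a minimal Python sketch. This is not the Geneious Prime implementation: the function name `call_consensus`, the toy pileup, the majority-base rule, and the choice to mask low-depth sites with `N` are assumptions made only for illustration. The one detail taken from the text is that a site contributes to the consensus only when its coverage depth is greater than 20.

```python
from collections import Counter

MIN_DEPTH = 20  # from the text: only sites with coverage depth > 20 are used


def call_consensus(pileup, min_depth=MIN_DEPTH):
    """Return a consensus string from per-site base observations.

    `pileup` holds one string per reference position; each string lists the
    bases observed in the reads covering that position (hypothetical input
    format). Sites with depth <= min_depth are masked with 'N', which is an
    assumption of this sketch rather than a documented behavior.
    """
    consensus = []
    for bases in pileup:
        if len(bases) <= min_depth:
            consensus.append("N")
            continue
        # Majority base at this site (ties resolved by first occurrence).
        base, _count = Counter(bases).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy pileup with depths 25, 20, and 30.
toy_pileup = ["A" * 24 + "G", "C" * 20, "T" * 18 + "C" * 12]
print(call_consensus(toy_pileup))  # prints "ANT"
```

The second toy site has a depth of exactly 20 and is therefore masked, because the stated criterion is strictly greater than 20.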
2.5. Variant Confirmation Using Sanger Sequencing
Conventional PCR was performed using the region-specific primer pairs shown in Table S1 and TaKaRa PrimeSTAR HS DNA Polymerase (TaKaRa, Shiga, Japan) to confirm the variant sequences in the genome of the 21st ASFV isolate (Korea/Pig/Inje2/2021). The reaction was performed on a Bio-Rad CFX-96 instrument (Bio-Rad).
2.6. Phylogenetic Analysis
Whole-genome sequences of genotype II ASFVs available in GenBank (https://www.ncbi.nlm.nih.gov/genbank/; data accessed on 01-Feb-2023) were downloaded. Sequences with an unusually high number of mutations, exceeding the typical viral mutation rate, were excluded because sequencing errors were suspected. For efficient computation and visualization, we chose representative sequences from among those redundantly reported in the same region during similar periods. In addition, we reduced the number of European ASFV sequences for clearer visualization, based on prior studies [29, 32], because these European viruses are phylogenetically distinct from the Korean isolates. A total of 64 reference genomes, including the Georgia 2007/1 strain, were selected for phylogenetic analysis (Table S2). The 21 ASFV genomes analyzed in this study, along with the reference genomes, were aligned using Multiple Alignment using Fast Fourier Transform (MAFFT) and manually trimmed to the same length as Georgia 2007/1, yielding approximately 187,420 sites including gaps. G/C homopolymers and inverted terminal repeats, which are prone to sequencing errors, were excluded.
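The exclusion of G/C homopolymer sites from the alignment can be sketched as follows. The text does not state how homopolymers were defined or removed, so the minimum run length (`MIN_RUN = 5`), the function names, and the toy alignment are all hypothetical. The sketch also assumes that runs are located on the aligned reference row, so a gap character interrupts a run.

```python
import re

MIN_RUN = 5  # hypothetical threshold; the text does not give a run length


def homopolymer_columns(reference, min_run=MIN_RUN):
    """Return 0-based alignment columns inside G or C runs of >= min_run
    bases on the aligned reference row."""
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_run, min_run))
    columns = set()
    for match in pattern.finditer(reference.upper()):
        columns.update(range(match.start(), match.end()))
    return columns


def drop_columns(alignment, columns):
    """Remove the given columns from every sequence in the alignment."""
    return {
        name: "".join(base for i, base in enumerate(seq) if i not in columns)
        for name, seq in alignment.items()
    }


# Toy alignment: the reference carries a run of six G residues.
toy_alignment = {
    "Georgia_2007_1": "ATGGGGGGTAC",
    "isolate_1": "ATGGGGG-TAC",
}
masked = homopolymer_columns(toy_alignment["Georgia_2007_1"])
print(drop_columns(toy_alignment, masked))
```

Dropping whole columns keeps all sequences the same length, so the filtered alignment can still be passed to tree-building software.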
A maximum-likelihood phylogenetic tree was constructed using RAxML v8.2.7 with the general time reversible (GTR) nucleotide substitution model. Bootstrap analysis with 500 replicates was used to assess the statistical support of the phylogenetic tree. Georgia 2007/1 was used as the root of the phylogenetic tree.
In addition, a time-scaled phylogenetic tree was constructed using the BEAST v1.10.4 program [35]. An uncorrelated relaxed clock model was used together with the GTR nucleotide substitution model with gamma-distributed rate variation (GTR + γ). Four Markov chain Monte Carlo runs, each comprising 150 million steps, were run in parallel. The parameters and trees were sampled every 10,000 steps, resulting in 40,000 parameter states and posterior trees. TRACER v1.5 was used to analyze the parameters, with 10% of each run discarded as burn-in [36]. All parameters had an effective sample size greater than 200. A time-scaled maximum clade credibility tree was generated using TreeAnnotator v1.10.4 (https://beast.community/treeannotator) in BEAST and visualized using FigTree v1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/). The ASFV/Korea/Pig/Inje2/2021 strain was excluded from the time-scaled phylogenetic analysis because of suspected recombination. Reference sequences that deviated from the normal mutation rate and were therefore considered erroneous were also excluded.
3. Results
3.1. Whole-Genome Comparative Analysis
Comparative analysis of the 21 ASFV whole-genome sequences with the Georgia 2007/1 reference strain sequence revealed 33 mutations, including single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (Indels), in the 21 Korean isolates. The Inje2/2021 strain had multiple additional mutations in the MGF 505-9R gene. Of the 33 mutations identified, 17 were nonsynonymous, five were synonymous, and the remaining 11 were located in IGRs, which do not code for any protein. All Korean isolates shared 11 mutations, of which six were nonsynonymous: T26425C (N329S) in MGF 360-10L, A44576G (K323E) in MGF 505-9R, T134514C (N414S) in NP419L, T170862A (I195F) in I267L, a truncation mutation (premature stop codon at W197) in MGF 110-1L caused by C7059T, and a frameshift deletion causing protein truncation at position 12578 in the ASFV G ACD 00190 gene. In addition, a GAATATATAG insertion was found in the IGR between the I73R and I329L genes in all Korean isolates, indicating that all Korean isolates belonged to the IGR II genotype, based on the TRS classification of the IGR between I73R and I329L. The GAATATATAG insertion was confirmed by Sanger sequencing in a previous study [28].
A total of 22 genetic polymorphisms, including SNPs and Indels, were detected among Korean ASFVs. The nonsynonymous mutations were as follows: A2329G (L106P) in MGF 360-1La, G10388A (P114S) in MGF 110-7L, G11277A (A17V) in 285L, C16649G (V243L) in MGF 360-4L, G23149A (A251V) in MGF 300-4L, G30606A (L268F) in MGF 360-12L, a truncation (from 358 to 288 aa) due to a C insertion at 33042/3 in MGF 360-14L, a frameshift mutation due to a C deletion at 36146 in MGF 505-3R, and C137334T (L506F) in NP868R. The Goseong/2021 strain exhibited a frameshift mutation (70-71 aa) in the X69R gene owing to a CTA insertion at position 20405/20406. Two ASFVs detected in wild boars in Korea in 2019 and 2020 were included in the mutational investigation. The YC1/2019 strain (accession number, ON075797), detected in 2019 in wild boars near the demilitarized zone (DMZ), the national border between South and North Korea, showed the same mutations as three ASFVs: Yeoncheon1/2019, Paju2/2019, and Paju5/2019 [37]. The HC224/2020 strain (accession number, OP628183) had the same mutations as YC1/2019 plus one additional mutation, a G insertion at 17846/7 in a noncoding region. Further details on the mutations identified in Korean ASFVs are shown in Figure 1 and Table S3.
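Coordinate-style labels such as T26425C or a C insertion at 33042/3 can be derived mechanically from a pairwise alignment against the reference. The following Python sketch shows one way to do this; it is not the authors' pipeline (the variants in this study were called in Geneious Prime), and the function name `list_variants`, the label wording, and the toy sequences are hypothetical.

```python
def list_variants(ref_aln, iso_aln, offset=1):
    """List SNPs and indels between two aligned sequences of equal length.

    Gaps are written as '-'. `offset` is the reference coordinate of the
    first reference base. Labels follow the style used in the text:
    'T26425C' for a substitution, 'C insertion at 33042/3'-like labels for
    insertions, and 'A deletion at 12578'-like labels for deletions.
    """
    if len(ref_aln) != len(iso_aln):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    pos = offset - 1  # coordinate of the last reference base consumed
    i, n = 0, len(ref_aln)
    while i < n:
        if ref_aln[i] == "-":
            # Insertion relative to the reference: collect the whole gap run.
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            inserted = iso_aln[i:j].replace("-", "")
            if inserted:
                variants.append(f"{inserted} insertion at {pos}/{pos + 1}")
            i = j
            continue
        pos += 1
        if iso_aln[i] == "-":
            # Deletion: collect consecutive deleted reference bases.
            start, deleted, j = pos, "", i
            while j < n and iso_aln[j] == "-" and ref_aln[j] != "-":
                deleted += ref_aln[j]
                j += 1
            pos = start + len(deleted) - 1
            variants.append(f"{deleted} deletion at {start}")
            i = j
            continue
        if ref_aln[i] != iso_aln[i]:
            variants.append(f"{ref_aln[i]}{pos}{iso_aln[i]}")
        i += 1
    return variants


# Toy alignment starting at reference coordinate 100.
print(list_variants("ACG-TAC", "ATGCT-C", offset=100))
# prints ['C101T', 'C insertion at 102/103', 'A deletion at 104']
```

Deciding whether a listed substitution is synonymous or nonsynonymous additionally requires the gene annotation and reading frame, which this sketch does not model.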

Notably, the Inje2/2021 strain exhibited numerous mutations concentrated in MGF 505-9R (13 mutations at nucleotide positions 43,882–43,934). Identical sequences were found at positions 45,814–45,866 in MGF 505-10R. The original sequences of MGF 505-9R (43,882–43,934) shared 75% identity with the MGF 505-10R gene, and nearby sequences also showed high homology (80.9%, nucleotide positions 43,874–43,941). These findings suggest that self-recombination may have occurred between the MGF 505-9R and MGF 505-10R genes in the ASFV/Korea/Pig/Inje2/2021 strain (Figure 2).
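The identity figures quoted above can be checked with simple arithmetic. The sketch below computes percent identity for two aligned segments and then applies the same formula to the reported window. It assumes that the window 43,882–43,934 is counted inclusively (53 bp) and that the 13 mutations are all substitutions accounting for every difference between the original MGF 505-9R segment and the corresponding MGF 505-10R segment. Both assumptions, the function names, and the toy sequences are illustrative rather than taken from the study.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned, equal-length segments."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_from_window(start, end, n_differences):
    """Percent identity implied by a count of differing sites in an
    inclusive coordinate window."""
    window_length = end - start + 1
    return 100.0 * (window_length - n_differences) / window_length


# Toy segments differing at 2 of 10 positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAA"))  # prints 80.0

# Reported window in MGF 505-9R with 13 differing sites.
print(round(identity_from_window(43882, 43934, 13), 1))  # prints 75.5
```

Under these assumptions, 40 of 53 matching sites gives about 75.5%, which is in line with the reported 75% identity between the original MGF 505-9R segment and MGF 505-10R.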

3.2. Phylogenetic Analysis
A phylogenetic tree was constructed to analyze the genomic epidemiology of ASFVs in South Korea. The maximum-likelihood phylogenetic tree revealed that genotype II ASFVs formed two distinct subgroups: Asian and European. All the Korean isolates clustered within the Asian subgroup (Figure 3). Although at least two distinct clusters specific to Korea were identified in the maximum-likelihood phylogenetic tree, most nodes in the tree did not receive strong bootstrap support (<70). Therefore, further investigation is required to explore the detailed genetic epidemiology of this maximum-likelihood phylogenetic tree.

In addition, a time-scaled Bayesian phylogenetic tree was generated to gain further insights into the detailed genetic epidemiology of ASFVs in South Korea. The time-scaled phylogenetic tree revealed that the Korean ASFVs were divided into at least three subgroups, with each subgroup sharing a common node supported by a high posterior probability (>0.9) (Figure 4). Notably, each ASFV subgroup exhibited a geographical pattern (Figures 4 and 5). Viruses isolated from north Gyeonggi-do (Yeoncheon, Paju) in 2019 and west Gangwon-do (Hwacheon, Hongcheon) during 2020–2021 formed a cluster in the phylogenetic tree, designated as Korean subgroup I. This cluster also included two ASFV whole-genome sequences (YC1/2019 and HC224/2020) detected in wild boars in 2019 (Yeoncheon) and 2020 (Hwacheon). The virus isolated from west Gyeonggi-do (Gimpo, Ganghwa) was clustered in Korean subgroup II. Furthermore, viruses isolated from Gangwon-do (Yeongwol, Goseong, and Inje) formed a distinct cluster, designated as Korean subgroup III. These findings suggest that at least three distinct viruses were introduced into South Korea through west and north Gyeonggi-do and east Gangwon-do. Five other isolates detected in Gyeonggi-do (Ganghwa, Paju) did not cluster with the other Korean isolates with a high posterior probability. These phylogenetic outliers indicate the possibility of multiple introductions of ASFVs in South Korea.


4. Discussion
In this study, we conducted a comprehensive analysis of the whole-genome sequences of 21 ASFVs isolated in South Korea between 2019 and 2021. We identified 33 mutations in the Korean isolates compared with the reference strain Georgia 2007/1. Of the 17 nonsynonymous mutations, four substitutions (T26425C, A44576G, T134514C, and T170862A), one truncation (C7059T), and one frameshift mutation (A deletion at 12578) were shared by all 21 Korean ASFVs. These mutations were also detected in ASFVs isolated from various Asian and European countries between 2007 and 2021 [29]. The A44576G (K323E) substitution in MGF 505-9R was documented in an ASFV isolate from Armenia as early as 2007. Similarly, the truncation caused by C7059T and three substitutions (A44576G, T134514C, and T170862A) were present in ASFVs detected in several countries between 2017 and 2020. These results indicate that the mutations shared by the Korean isolates arose before the virus was introduced into South Korea.
SNPs and Indels found in Korean isolates have also been detected in isolates from other countries. Frameshift mutations at 12578 and substitutions at T26425C have been detected in multiple countries and years. The frameshift mutation at position 12578 was more prevalent and was found in ASFVs from Lithuania, Poland, China, Vietnam, Russia, and Germany between 2014 and 2021. Substitution at T26425C was observed in ASFVs from China, Timor-Leste, Vietnam, Poland, and Armenia in 2018 and 2019. These findings suggest the possibility of multiple viral introductions into Korea, indicating that viruses were brought into the country at multiple instances, rather than mutations arising during local transmission. Furthermore, we identified a unique C insertion at position 33042/3 of the MGF 360-14L coding sequence. This insertion was observed in multiple Korean ASFVs from 2019 to 2021 as well as in ASFVs from Timor-Leste and Vietnam in 2019. The presence of this insertion in the Korean and international ASFVs suggests its potential significance as a relatively recent mutation. Proteins p30 (CP204L gene), p54 (E183L gene), and p72 (B646L gene) are suspected to be major epitopes for antibody-mediated protection. However, Korean isolates have not shown any mutations in these proteins in this study. Further studies are required to determine whether these mutations, particularly protein truncations, contribute to changes in the biological characteristics of the virus.
Our study also provides further support for the existence of distinct subgroups of ASFVs in Korea, indicating multiple introductions of the virus over time. Through time-scaled phylogenetic analysis, we found that subgroups I and II were initially isolated in Yeoncheon and Gimpo, respectively, and outlier viruses that did not cluster with other Korean isolates were identified in Ganghwa and Paju. These findings suggest the continuous introduction of distinct viruses in areas near the DMZ. In subgroup III, the first outbreak was observed in Yeongwol, Gangwon-do, in 2021, a region geographically distant from the previously affected areas. Our phylogenetic analysis suggests that subgroup III viruses may have been introduced through the DMZ near the east coast. However, it is important to acknowledge the limitations of our epidemiological investigation, particularly the absence of related reference sequences, including those from North Korea and wild boars. This leaves open the possibility that these clusters result from ASFV genetic mutations in wild boars, despite the virus’s low mutation rate. Hence, global collaboration for the whole-genome sequencing of ASFVs from domestic pigs and wild boars is necessary to enhance our understanding of the epidemiology and transmission dynamics of the virus.
Recombination is a potential source of viral evolution, including changes in host range, virulence or pathogenesis, tissue tropism, resistance to antivirals, and the emergence of new viral diseases [38]. Previous studies have reported recombination events in ASFV, including homologous recombination leading to genomic Indels that contribute to the genetic diversity of the virus [39], and recombination among different genotypes, facilitated by recombination hotspots, that generates diverse genetic strains [40]. In our study, we identified possible self-recombination in the MGF 505 genes of the ASFV/Korea/Pig/Inje2/2021 strain. We detected a cluster of 13 SNPs within 52 bp of the MGF 505-9R gene; the resulting sequence is identical to the corresponding sequence in MGF 505-10R. Recombination can occur through template switching by the polymerase between DNA or RNA strands that have high sequence identity [38]. Although MGF 505-9R and MGF 505-10R are paralogous genes, their original sequences showed an overall identity of only approximately 61.4%. However, MGF 505-9R showed 80.9% identity with MGF 505-10R in the region where the putative recombination occurred. The precise mechanism of recombination between different gene locations in viruses has not yet been fully determined. Although Inje2/2021 exhibited potential self-recombination in MGF 505-9R, there were no discernible differences in virulence in pigs compared with the first Korean ASFV, Paju1/2019 [28].
It is worth noting that although recombination events between different ASFV isolates have been reported, self-recombination of ASFV has not been previously documented [40, 41]. MGF 505/530 genes, along with MGF 360 genes, are believed to play important roles in virus tropism, virulence, and suppression of the interferon response, although the precise functions of the encoded proteins are not fully understood [42]. Further studies are needed to elucidate the mechanisms underlying this recombination event and its implications for changes in biological characteristics, particularly the roles of proteins within individual viruses.
Despite the limitations of our study, primarily the lack of related reference sequences, genomic epidemiology using whole-genome sequences of ASFV provides valuable information on viral epidemiology. Therefore, continuous molecular epidemiological studies based on whole-genome sequences of ASFV are crucial for monitoring the origin of outbreaks and strengthening surveillance efforts. In this study, we present evidence for possible multiple introductions of ASFV into South Korea through the DMZ. These findings underscore the persistent challenge of repeated introduction of ASFVs into South Korea despite the elimination of previous strains through quarantine strategies. Therefore, intensive disease control measures are needed in regions such as Ganghwa-gun and Paju-si, where wild animals are likely to cross the DMZ, to prevent the introduction of new viruses.
In conclusion, our study provided valuable insights into the genetic diversity, mutations, and subgroups of ASFVs in South Korea. The identified mutations and subgroups suggest multiple introductions of ASFV strains into the country over time. These findings emphasize the need for intensified disease control measures, particularly in regions near the DMZ, to prevent the introduction of new viruses.
Data Availability
The whole-genome sequences used in this study were deposited in GenBank, and the accession numbers are provided in Table 1. Sequence alignments and results from phylogenetic analyses are available from the corresponding authors on reasonable request.
Conflicts of Interest
The authors declare that there is no conflict of interest with respect to the research, authorship, and publication of this article.
Authors’ Contributions
Jung-Hoon Kwon and Yeun-Kyung Shin conceptualized the study. Oh-Kyu Kwon, Jida Choi, Ki-Hyun Cho, and Seong-Keun Hong collected samples and outbreak data. Da-Won Kim, Ji-Yun Kim, and Dong-Wook Lee performed genetic data curation. Da-Won Kim performed genetic analysis. Oh-Kyu Kwon and Da-Won Kim wrote the first draft of the manuscript. Jin-Ju Nah, Yeun-Hee Kim, and Hae-Eun Kang revised the draft manuscript. Jung-Hoon Kwon and Yeun-Kyung Shin revised the final manuscript. Jin-Jun Nah, Hae-Eun Kang, and Yeun-Kyung Shin obtained funding for this work. Oh-Kyu Kwon and Da-Won Kim contributed equally to this work.
Acknowledgments
We thank our colleagues in the veterinary laboratories of provincial governments for assistance in collecting, screening, and transporting clinical samples. This study was supported by the Animal and Plant Quarantine Agency, Republic of Korea (Grant Number I-1543085-2022-24-01).
Supplementary Materials
Table S1: primers for Sanger sequencing of partial MGF 505-9R. Table S2: list of reference African swine fever virus sequences used in phylogenetic analysis. Table S3: summary of genetic variations detected between the 21 Korean African swine fever virus sequences and the reference strain, Georgia 2007/1 (NC_044959).