Abstract

Genome-wide association studies (GWAS) are a powerful tool for identifying genomic regions and causative genes associated with economically important traits in dairy cattle, particularly complex traits such as milk production. This is possible due to advances in next-generation sequencing technology. This review summarized information on identified candidate genes and genomic regions associated with milk production traits in Holstein and its crossbreds from various regions of the world. Milk production traits are important in dairy cattle breeding programs because of their direct economic impact on the industry and their close relationship with nutritional requirements. GWAS have been used in a large number of studies to identify genomic regions and candidate genes associated with these traits, and many have already been identified in Holstein and its crossbreds. Genes and single nucleotide polymorphisms (SNPs) that significantly affect milk yield (MY) were found on all autosomes except chromosomes 27 and 29. Half of the reported SNPs associated with fat yield and fat percentage were found on chromosome 14, whereas a large number of significant SNPs for protein yield (PY) and protein percentage were found on chromosomes 1, 5, and 20. Approximately 155 SNPs with significant influence on multiple milk production traits have been identified. Several promising candidate genes, including diacylglycerol O-acyltransferase 1, plectin, Rho GTPase activating protein 39, protein phosphatase 1 regulatory subunit 16A, and sphingomyelin phosphodiesterase 5, were found to have pleiotropic effects on all five milk production traits. Thus, to improve milk production traits, it is of practical relevance to focus on significant SNPs and pleiotropic genes frequently found to affect multiple milk production traits.

1. Introduction

Milk is a highly nutritious and valuable human food consumed by millions of people every day in a variety of flavors and products. Milk production traits, namely milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), and protein percentage (PP), are the essential economic traits used to evaluate milk quantity and quality, and they play a major role in dairy development [1]. Milk traits are influenced by multiple genes, and therefore genomic evaluations have the potential to rapidly increase the rate of genetic improvement for these traits in dairy cattle [2]. Understanding genetic variation in dairy cattle is crucial to associating genomic regions with MY and milk composition traits. The sequencing of the bovine genome in 2004 sparked a worldwide effort to improve how cattle genetic values can be estimated using basic genetic coding information [3].

Detecting genomic regions will help to identify potential candidate genes that may be responsible for genetic variation in MY and milk composition traits. These candidate genes could help to improve our understanding of the biological background of milk production traits. Genome-wide association studies (GWAS) are a popular method for determining which genes and genomic regions influence the expression of specific phenotypes by identifying single nucleotide polymorphisms (SNPs) associated with those phenotypes across the whole genome [4, 5]. GWAS can effectively identify potential genetic variants associated with quantitative traits and facilitate the use of molecular information for genomic selection in dairy cattle [6, 7].

GWAS have been extensively used in recent years to identify genomic regions and candidate genes for milk production traits in Holstein and its crossbreds in cattle populations from various countries. Numerous candidate genes and genomic regions associated with milk production traits in Holstein and its crossbreds have already been identified [7–9, 16, 17]. The objective of this review was to summarize the reported genomic regions and candidate genes associated with milk production traits, including MY, FY, PY, FP, and PP, in Holstein and its crossbreds.

2. Methods

Data were gathered from Google Scholar, Science Direct, PubMed, Springer Link, Web of Science, and Scopus using the keywords GWAS, genomic markers, Holstein, crossbred, and milk production traits. The current review included published studies that discussed candidate genes and genomic regions that were significantly associated with milk production traits in Holstein and its crossbreds. We included studies that used a p-value as a statistical significance criterion and that reported both SNPs and candidate genes. Only articles published in English in peer-reviewed journals since 2009 were included; conference papers, books, book chapters, theses, and unpublished results were therefore excluded. To ensure consistency throughout the review, SNP names that researchers reported in other formats were converted to the rs name format.
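The inclusion criteria above can be read as a simple screening rule applied to each retrieved record. The following is a minimal sketch of that rule, not the procedure actually used for this review: the `StudyRecord` fields, the `EXCLUDED_TYPES` labels, and the reading of "since 2009" as 2009 inclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels for the document types the Methods section excludes.
EXCLUDED_TYPES = {"conference paper", "book", "book chapter", "thesis", "unpublished"}


@dataclass
class StudyRecord:
    """One retrieved record; every field name here is an assumption."""
    year: int
    language: str
    peer_reviewed: bool
    doc_type: str
    holstein_or_crossbred: bool
    uses_p_value: bool
    reports_snps: bool
    reports_genes: bool


def is_eligible(study: StudyRecord) -> bool:
    """Return True if the record meets every stated inclusion criterion."""
    return (
        study.year >= 2009  # "since 2009", assumed to include 2009
        and study.language == "English"
        and study.peer_reviewed
        and study.doc_type not in EXCLUDED_TYPES
        and study.holstein_or_crossbred
        and study.uses_p_value
        and study.reports_snps
        and study.reports_genes
    )


# Placeholder record, not a study from the reviewed literature.
example = StudyRecord(
    year=2015, language="English", peer_reviewed=True, doc_type="journal article",
    holstein_or_crossbred=True, uses_p_value=True, reports_snps=True, reports_genes=True,
)
print(is_eligible(example))
```

Expressing the criteria as one conjunction makes it explicit that a record failing any single criterion, for example one reporting SNPs but no candidate genes, is excluded.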

3. GWAS for Milk Production Traits in Holstein and Its Crossbreds

The phenotypic expression of milk production traits (MY and milk composition) is controlled by many genes. The detection of potential candidate genes affecting milk production traits of cattle is made possible by the widespread availability of SNP markers through the fast-growing number of genotyped cattle [16]. Several GWAS focused on the identification of potential candidate genes and genomic regions underlying milk production traits (MY, FY, FP, PY, and PP). Most researchers conducted association studies using 50 K chips, except [7, 18], who used 26 and 100 K chips, respectively. The methodologies used were linear, single-locus, multi-locus, and Bayesian mixed models. This review summarized 462 significantly associated SNPs, of which 34 were reported for milk production traits by more than one group of researchers. Of these 34 SNPs, ten were reported three or more times: rs109421300, rs109350371, rs109146371, rs109558046, rs109752439, rs109234250, rs109968515, rs110199901, rs17870736, and rs43703011. The remaining 24 SNPs were reported twice. For instance, rs109421300 was reported by [11, 13, 14, 17, 21].

Diacylglycerol O-acyltransferase 1 (DGAT1) was the candidate gene most frequently reported as associated with one or more milk production traits [9, 11, 13, 14, 17, 19, 22]. GHR was reported by [11, 13, 21, 23], MAPK15 by [15, 21, 24], and KHDRBS3 by [7, 16, 21]. The remaining candidate genes were reported by fewer than four researchers. Researchers [8, 16, 18] conducted association studies for milk production traits with crossbred dairy cattle ranging from 87.50% to <100% Holstein, Holsteinized Black-and-White Pied, and Gir × Holstein (Girolando) in Thailand, Russia, and Brazil, respectively, using a single-marker linear model. The remaining studies included in this review were conducted with Holstein and its crossbreds.
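In its simplest form, a single-marker linear model regresses the phenotype on allele dosage one SNP at a time and tests whether the slope differs from zero. The sketch below illustrates only that core idea under strong simplifying assumptions: it has no fixed effects, no polygenic or relationship term (which the mixed models used in the reviewed studies would include), and it uses a large-sample normal approximation for the p-value. The function names and the Bonferroni helper are illustrative choices, not the methods of any cited study.

```python
import math
from statistics import NormalDist, mean


def single_marker_test(dosages, phenotypes):
    """Regress phenotype on allele dosage (0, 1, 2) for one SNP.

    Returns (effect, standard_error, p_value). The p-value is a two-sided
    normal approximation to the t test, reasonable only for large samples.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need equally long lists with at least 3 animals")
    x_bar, y_bar = mean(dosages), mean(phenotypes)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, phenotypes))
    effect = sxy / sxx                      # allele substitution effect
    intercept = y_bar - effect * x_bar
    rss = sum((y - intercept - effect * x) ** 2
              for x, y in zip(dosages, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)     # standard error of the slope
    if se == 0:
        return effect, 0.0, 0.0             # perfect fit
    z = effect / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return effect, se, p_value


def bonferroni_threshold(n_tests, alpha=0.05):
    """Genome-wide significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Toy data for six animals (placeholders, not from any reviewed study).
dosages = [0, 0, 1, 1, 2, 2]
milk_yield = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0]
effect, se, p = single_marker_test(dosages, milk_yield)
print(effect, se, p < bonferroni_threshold(50_000))
```

With a 50 K chip, a Bonferroni correction of this kind puts the per-SNP threshold near 1e-6, which is one reason different studies can report different subsets of significant SNPs at the same locus.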

3.1. Milk Yield

MY is the most economically important trait, and several researchers have sought to identify the genes and genomic regions that contribute to its variation in Holstein and its crossbreds [7, 11, 13, 16, 17]. Publications that used GWAS for MY are shown in Table 1. These researchers reported 103 individual SNPs that were significantly associated with MY. These SNPs were found on all autosomes except chromosomes 27 and 29 in Holsteins and their crossbreds. Figure 1 shows the frequency of SNPs identified by different researchers within each chromosome. Chromosomes 14 and 20 carried a high number of SNPs, so future research aimed at improving MY could focus on these two chromosomes.

The candidate genes significantly affecting MY that were reported more than twice (Table 1) were GNA14 in Thai Holstein crossbreds [8], PTBP2 in U.S. Holstein [19], and U6 in Brazilian Holstein crossbreds [16].

3.2. Fat Yield and Fat Percentage

Fat is an important component of milk, and it is controlled by gene networks associated with several metabolic and biological pathways. The identification of potential genes and their locations can provide valuable information for selective breeding to improve milk quality. A total of 46 SNPs significantly associated with fat yield (FY) and 117 SNPs significantly associated with fat percentage (FP) were detected on various chromosomes in Holstein and its crossbreds. Two SNPs (rs109350371 and rs109421300) were reported more than twice as significantly associated with FP [9, 12, 17, 19, 20]. Figure 2 shows the number of significant SNPs associated with FY and FP on each chromosome in Holstein and its crossbreds. Chromosome 14 contains a large number of significant SNPs associated with FP, accounting for more than 75% of the SNPs on this chromosome. Conversely, for FY, chromosomes 5 and 14 have an equal number of significantly associated SNPs.

A detailed list of the candidate genes, significant SNPs, and chromosome numbers for FY and FP is presented in Table 2. Several candidate genes influence the expression of FY, including inositol 1,4,5-trisphosphate receptor type 2 (ITPR2), ATP-binding cassette sub-family C member 9 (ABCC9), sulfonylurea receptor 2 (SUR2), cleavage and polyadenylation specific factor 1 (CPSF1), DGAT1, phosphodiesterase 4B (PDE4B), and methyltransferase like 15 (METTL15), as reported by [7, 12, 13, 17, 25]. Similarly, multiple candidate genes influence the expression of FP, including 5-oxoprolinase, ATP-hydrolysing (OPLAH), G protein-coupled receptor 20 (GPR20), collagen type XXII alpha 1 chain (COL22A1), glutamate receptor ionotropic NMDA type subunit associated protein (GRINA), forkhead box H1 (FOXH1), microsomal glutathione S-transferase 1 (MGST1), ephrin type-A receptor 6 (EPHA6), and alanine and arginine rich domain containing protein (AARD), as reported by [9, 11, 13, 15, 19, 21, 27].

3.3. Protein Yield and Protein Percentage

Candidate genes, significant SNPs, and chromosome numbers for PY and PP are presented in Table 3. There were 44 significantly associated SNPs for PY and 101 significantly associated SNPs for PP in Holstein and its crossbreds. Figure 3 shows the number of significant SNPs associated with PY and PP in chromosomes from Holstein and its crossbreds. Many significant SNPs were reported on chromosome 20, and about half of the significant SNPs for PP were identified on chromosomes 20, 6, and 5. In addition, chromosomes 1 and 5 had a large number of significant SNPs for PY.

As listed in Table 3, genes associated with PY included pyruvate dehydrogenase E1 subunit alpha 2 (PDHA2), C-terminal binding protein 2 (CTBP2), mitogen-activated protein kinase 9 (MAPK9), Hermansky-Pudlak syndrome 3 (HPS3), ADP ribosylation factor guanine nucleotide exchange factor 1 (ARFGEF1), solute carrier organic anion transporter family member 1A2 (SLCO1A2), and major facilitator superfamily domain containing 1 (MFSD1) [7, 14, 20, 21, 24, 25]. Several potential genes were associated with PP, for example, growth hormone receptor (GHR), nipped-B-like protein (NIPBL), platelet-derived growth factor receptor alpha (PDGFRA), peroxisome proliferator-activated receptor gamma coactivator 1-alpha (PPARGC1A), casein kappa (CSN3), RNA polymerase II associated protein 3 (RPAP3), solute carrier family 1 member 3 (SLC1A3), and zinc finger protein 384 (ZNF384) [7, 11, 14, 19, 22, 23].

3.4. All Milk Production Traits

A total of 136 SNPs were significantly associated with two or more milk production traits (MY, FY, PY, FP, and PP). According to Fontanesi et al. [22], rs109234250 was significantly associated with all five milk production traits. As reported by [11, 12, 15, 17, 21, 22], 14 SNPs affected four, 39 SNPs affected three, and 86 SNPs affected two milk production traits. The number of significant SNPs associated with multiple milk production traits in Holstein and its crossbreds is shown in Figure 4. Chromosome 14 carried the greatest number of SNPs affecting multiple milk production traits. Thus, selection programs should focus on candidate genes and genomic regions that are known to influence multiple production traits.
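The counts of SNPs affecting two, three, four, or five traits amount to grouping SNP-trait associations by SNP and tallying the number of distinct traits per SNP. A minimal sketch of that tally follows; the record layout, the function names, and the placeholder SNP identifiers are assumptions for illustration and are not data from the reviewed studies.

```python
from collections import Counter, defaultdict

TRAITS = ("MY", "FY", "PY", "FP", "PP")


def traits_per_snp(associations):
    """Map each SNP ID to the set of traits it was significantly associated with."""
    by_snp = defaultdict(set)
    for snp_id, trait in associations:
        if trait not in TRAITS:
            raise ValueError(f"unknown trait: {trait}")
        by_snp[snp_id].add(trait)  # a set ignores repeated reports of one pair
    return dict(by_snp)


def pleiotropy_tally(associations):
    """Count SNPs by number of distinct traits, keeping SNPs with two or more."""
    counts = Counter(len(traits) for traits in traits_per_snp(associations).values())
    return {n_traits: n_snps for n_traits, n_snps in sorted(counts.items())
            if n_traits >= 2}


# Placeholder (SNP, trait) records; the IDs are hypothetical.
records = [
    ("snp_a", "MY"), ("snp_a", "FY"), ("snp_a", "FP"),
    ("snp_b", "PP"), ("snp_b", "PY"),
    ("snp_c", "MY"),
    ("snp_a", "MY"),  # same SNP-trait pair reported by a second study
]
print(pleiotropy_tally(records))  # {2: 1, 3: 1}
```

Collecting traits in a set means that a SNP-trait pair reported by several studies counts once, so the tally reflects pleiotropy rather than how often a result was replicated.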

Candidate genes, significant SNPs, and chromosomes that are simultaneously associated with more than one milk production trait are listed in Table 4. Several promising candidate genes were identified, including DGAT1, plectin (PLEC), Rho GTPase activating protein 39 (ARHGAP39), protein phosphatase 1 regulatory subunit 16A (PPP1R16A), and sphingomyelin phosphodiesterase 5 (SMPD5). The genes retinol saturase (RETSAT), AarF domain containing kinase 5 (ADCK5), arc regulates transcription adhesion G protein-coupled receptor B1 (ARC-ADGRB1), ARHGAP39, DGAT1, forkhead box H1 (FOXH1), PLEC, solute carrier family 52 member 2 (SLC52A2), and prolactin receptor (PRLR) frequently affected four milk production traits [11, 12, 15, 21].

4. Conclusion

This review summarized information on identified candidate genes and genomic regions associated with milk production traits in Holstein and its crossbreds from various regions of the world. Most of the identified SNPs and candidate genes were on chromosome 14. One of the challenges in dairy cattle selection is that milk production traits are expressed after the first calving. Candidate gene and genomic region information would permit earlier selection of males and females, shorten the generation interval, and accelerate genetic progress for milk production traits.

Conflicts of Interest

The author(s) declare(s) that they have no conflicts of interest.