Abstract
SOXE transcription factors, including SOX8, SOX9, and SOX10, regulate several developmental events, such as sex determination, chondrogenesis, and neurogenesis. This study systematically identified six SoxE subfamily genes in the turbot (Scophthalmus maximus) genome and transcriptome: SmSox8a, SmSox8b, SmSox9a, SmSox9b, SmSox10a, and SmSox10b. The presence of duplicates of all three SoxE members indicates that the SoxE subfamily underwent significant expansion in turbot. Genomic structural analysis detected relatively conserved exon-intron structures and intron insertions. Phylogenetic analysis supported the previous classification of the SoxE subfamily. Expression profiling suggested that turbot SoxE subfamily genes may be involved in different activities, such as neurogenesis and gonad development. These findings will assist in understanding the activities and evolution of the SoxE gene subfamily in fish.
1. Introduction
The Sox family encodes various transcription factors (TFs) in the animal kingdom that regulate diverse biological processes [1]. The Sox TF family is characterized by the presence of an SRY-related high-mobility group (HMG) box. On the basis of the structural homology of the HMG domain together with partial regions outside the HMG-box, the Sox family is divided into 11 subfamilies (A–K) [2, 3]. Sox genes within a subfamily share more than 80% homology in their HMG-box sequences and exhibit similar biochemical properties and biological functions [4, 5]. Among the Sox subgroups, the structure and function of the SoxE subgroup have been studied most extensively [6, 7].
In mammals and other higher vertebrates, the SoxE subfamily is composed of three members, namely, Sox8, Sox9, and Sox10. For example, three SoxE genes were discovered in Homo sapiens, Mus musculus, and Gallus domesticus [7–9]. Although fish occupy a basal position among vertebrates, they are the most widely distributed and account for nearly half of the existing vertebrate species. Because teleost fish underwent teleost-specific whole-genome duplication (3R-WGD), a greater number of SoxE genes are present in fish compared with other vertebrates. For example (as shown in Table 1), channel catfish (Ictalurus punctatus) has 4 putative SoxE genes, zebrafish (Danio rerio) has 5, tongue sole (Cynoglossus semilaevis) has 6, Japanese flounder (Paralichthys olivaceus) has 6, pufferfish (Tetraodon fluviatilis) has 6, Nile tilapia (Oreochromis niloticus) has 6, and common carp (Cyprinus carpio) has 10 [3, 10–14]. This phylogenetic position makes fish an attractive group for studying SoxE evolution and function. SoxE genes modulate different bioprocesses in vertebrates, such as nervous system development [7, 15], skeletogenesis [16], and sex determination and differentiation [17, 18]. In fish, Sox9 is the most studied SoxE gene and exerts a crucial effect on sex determination and differentiation in P. olivaceus and medaka (Oryzias latipes) [19, 20]. In spotted sea bass (Lateolabrax maculatus), Sox8b, Sox9b, and Sox10 are upregulated in the brain, indicating that SoxE is a key regulator in central nervous system (CNS) development [21].
Turbot (Scophthalmus maximus) is a valuable commercial farming fish in the Chinese aquaculture industry, especially in Northern China [22–25]. The genome sequence of S. maximus has been published (GenBank Accession: PRJNA821077). The SoxE subfamily has been systematically characterized in certain fish species but not in S. maximus. This study identified the turbot SoxE genes, analyzed their sequence structures, and evaluated their evolutionary characteristics. Moreover, this work analyzed gene expression profiles to investigate possible SoxE activities in adult tissues. Our findings will contribute to a better understanding of SoxE-related biological activities in turbot and in other teleost species.
2. Materials and Methods
2.1. Animals and Sample Collection
Turbot individuals (age: 2 years, mean length: 38 ± 3.62 cm, and mean weight 1.82 ± 0.21 kg) were obtained from an aquatic product market in Lianyungang (Jiangsu, China). The turbot individuals were anesthetized using MS-222, followed by the collection of tissues, such as the brain, gill, liver, heart, spleen, stomach, kidney, muscle, intestines, and gonad (ovary or testis). The collected tissues were frozen immediately in liquid nitrogen and stored at −86°C until RNA purification. All animal-based experiments were approved by the Animal Research and Ethics Committee of Jiangsu Ocean University, and the detailed experimental operations were consistent with previous studies [26, 27].
2.2. Identification of SoxE in S. maximus
The genes were identified on the basis of the conserved HMG-box in the SoxE genes of zebrafish (Danio rerio). The zebrafish sequences were obtained from Ensembl (https://asia.ensembl.org/Danio_rerio/Info/Index) and used as queries in tblastn searches with NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), with an E-value threshold of 1e−6. All gene candidates were checked for the presence of the core motif RPMNAFMVW, which confirmed that a candidate was a Sox gene [2].
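The motif check described above can be sketched as a simple substring search over candidate protein sequences. This is only an illustrative sketch, not the procedure the authors ran: the function names and the toy sequences are hypothetical, and only the motif string RPMNAFMVW comes from the text.

```python
# Core HMG-box motif named in the text; everything else here is illustrative.
HMG_CORE_MOTIF = "RPMNAFMVW"


def find_core_motif(protein_seq, motif=HMG_CORE_MOTIF):
    """Return the 1-based start position of the motif, or None if it is absent."""
    # Normalize case and drop whitespace that may come from FASTA line wrapping.
    seq = "".join(protein_seq.split()).upper()
    idx = seq.find(motif)
    return idx + 1 if idx >= 0 else None


def filter_sox_candidates(candidates):
    """Keep only the candidates (name -> protein sequence) containing the motif."""
    return {
        name: seq
        for name, seq in candidates.items()
        if find_core_motif(seq) is not None
    }


# Hypothetical toy sequences, not real turbot proteins.
toy_candidates = {
    "candidate_1": "MKDHIKRPMNAFMVWAQAAR",
    "candidate_2": "MKDHIKAAAAAAAAAAQAAR",
}
kept = filter_sox_candidates(toy_candidates)
```

In this toy input only `candidate_1` carries the motif, so it is the only entry retained.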
2.3. Sequence Analysis and Structure Construction of SoxE
We obtained DNA, mRNA, and protein sequences of the S. maximus SoxE subfamily from NCBI (https://www.ncbi.nlm.nih.gov). The exon-intron structures were acquired from the annotation files of the reference genome. On the basis of the principle of equivalence, introns, exons, HMG-boxes, and open-reading frames were further identified, followed by mapping of the SoxE subfamily gene structure. Subsequently, the ExPASy Compute pI/MW tool (https://www.expasy.org/tools/) was used to calculate the isoelectric point (pI), molecular weight (MW), and other attribute values of every SoxE protein. SMART (https://smart.embl.de/smart/show_motifs.pl) was adopted to predict conserved HMG sequences in SoxE proteins, while DNAMAN 8.0 was used to perform multiple alignments. WebLogo (https://weblogo.berkeley.edu/logo.cgi) was used to visualize the multiple sequence alignment of the six S. maximus HMG domains.
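As a rough illustration of what a molecular weight calculation involves, the sketch below sums average residue masses and adds one water molecule for the free termini. It is an assumption-laden approximation rather than a reimplementation of the ExPASy tool: the function name is hypothetical, the mass table uses standard average residue masses, and pI estimation is left out because it depends on the pKa set chosen.

```python
# Average residue masses in Da (amino acid mass minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight(protein_seq):
    """Approximate average molecular weight (Da) of an unmodified protein."""
    seq = "".join(protein_seq.split()).upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


# Mass of the core HMG motif peptide alone, in kDa (illustrative only).
motif_mass_kda = molecular_weight("RPMNAFMVW") / 1000.0
```

Dividing the result by 1000 gives kDa, the unit in which the predicted MWs are reported in this study.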
2.4. Phylogenetic Analyses of SoxE
A total of 59 full-length SoxE protein sequences were downloaded from NCBI and Ensembl. They covered M. musculus, H. sapiens, G. domesticus, X. laevis, O. latipes, D. rerio, I. punctatus, P. olivaceus, C. semilaevis, T. rubripes, O. niloticus, and C. carpio. SMART was used to identify and retrieve HMG domains for phylogenetic analysis. ClustalW was used to align the HMG-box sequences, and a phylogenetic tree was then constructed by the neighbor-joining (NJ) approach in MEGA 7.0, using the Poisson model with 1000 bootstrap replicates.
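The Poisson model converts the proportion of differing amino acid sites, p, into an evolutionary distance d = −ln(1 − p), and distances of this kind are the input to neighbor joining. The sketch below computes that distance for aligned sequences. It is a minimal illustration, not MEGA's implementation: the function names are hypothetical, and skipping gapped columns pair by pair (pairwise deletion) is an assumption about gap handling.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    diffs = sum(1 for a, b in pairs if a != b)
    return diffs / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


def distance_matrix(aligned):
    """Pairwise Poisson distances for a dict of name -> aligned sequence."""
    names = sorted(aligned)
    return {
        a: {b: poisson_distance(aligned[a], aligned[b]) for b in names}
        for a in names
    }
```

A neighbor-joining routine would then take such a matrix as input; bootstrap support is obtained by resampling alignment columns with replacement and rebuilding the tree for each replicate.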
2.5. RNA Isolation and Real-Time Quantitative Reverse Transcription PCR
Total RNA was extracted from the tissues by using TRIzol (Invitrogen, CA, USA), and first-strand cDNA was synthesized using MMLV reverse transcriptase and oligo(dT)18 primers (Thermo Fisher Scientific, USA) according to the manufacturer’s protocol. Specific primers (Table 2) were designed using Oligo 7.0. Primer specificities were examined through alignment against S. maximus transcriptomes (unpublished data) by using BLASTN, with an E-value of 1e−8. β-actin served as the endogenous control gene. Figure S1 displays the melting curves of the different genes. Real-time quantitative reverse transcription PCR (qRT-PCR) was performed in triplicate using the LightCycler 480 Real-time PCR System (Roche Diagnostics, Mannheim, Germany). The reaction volume was 20 μL and consisted of 10 μL TB Green Premix Ex Taq Mix (Takara, Japan), 1.6 μL cDNA template, 8 μL ddH2O, and 0.2 μL of each primer. PCR conditions were 30 s at 94°C, followed by 35 cycles of 10 s at 94°C, 30 s at 60°C, and 30 s at 72°C. The 2^−ΔΔCt approach was used to determine SoxE expression, consistent with the previous method of calculating gene expression [28, 29]. SPSS 19.0 was employed for statistical analysis through an independent sample t-test. was considered statistically significant.
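The 2^−ΔΔCt calculation can be sketched as follows. ΔCt is the target gene Ct minus the reference gene (β-actin) Ct within one sample, and ΔΔCt is the sample ΔCt minus the ΔCt of a calibrator sample. The function names, the choice of calibrator, and all Ct values below are hypothetical and serve only to illustrate the arithmetic.

```python
def mean(values):
    """Arithmetic mean, e.g. of triplicate Ct measurements."""
    return sum(values) / len(values)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference) within one sample
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: a SoxE gene and beta-actin in a test tissue
# and in a calibrator tissue. Triplicates are averaged before use.
fold_change = relative_expression(
    ct_target_sample=mean([24.0, 24.0, 24.0]),
    ct_ref_sample=mean([18.0, 18.0, 18.0]),
    ct_target_calibrator=26.0,
    ct_ref_calibrator=18.0,
)
```

With these made-up values the target gene has a ΔCt two cycles lower in the test tissue than in the calibrator, which corresponds to a fourfold higher relative expression. The method assumes amplification efficiencies close to 100% for both the target and the reference gene.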
3. Results and Discussion
3.1. SoxE Identified in the S. maximus Genome
Six SoxE genes were identified in the S. maximus genome by using Ensembl and NCBI (Table 3). The cDNA lengths ranged from 2211 to 4592 bp, and the encoded proteins were 464–499 aa in length. The predicted MWs were 50.88–54.42 kDa, and the pIs were 6.14–7.17. The number of SoxE genes in S. maximus is the same as in P. olivaceus, C. semilaevis, T. fluviatilis, and O. niloticus (Table 1). In these teleosts, each member of the SoxE subfamily is present as two copies in the genome (Sox8a, Sox8b, Sox9a, Sox9b, Sox10a, and Sox10b). This may be because these bony fish underwent whole-genome duplication (3R-WGD) during evolution, resulting in a significant expansion of the SoxE subfamily [30–32]. Duplicate copies of SoxE subfamily members have also been reported in other teleosts, such as the large yellow croaker and spinyhead croaker [33, 34]. However, Cyprinus carpio has more SoxE genes (CySox8a, CySox8b, CySox8c, CySox8d, CySox9a, CySox9b, CySox9c, CySox9d, CySox10a, and CySox10b) than S. maximus and the other aforementioned fish, implying that C. carpio underwent an additional genome duplication event during evolution (4R-WGD) [14].
3.2. Genomic Structure and Sequence Alignment of Turbot SoxE
The genomic structure of S. maximus SoxE was investigated and constructed using an online analysis tool (https://gsds.gao-lab.org). Relatively conserved exon-intron structures were observed within the SoxE subgroup [10]. On the basis of the distribution of introns and exons, turbot SoxE genes can be divided into two categories, those with two introns and those with three. Specifically, SmSox8a, SmSox8b, SmSox9a, and SmSox9b have two introns, whereas SmSox10a and SmSox10b have three introns (Figure 1(a)). An intron insertion occurred in the HMG domain of all SmSoxE genes. The presence of introns in HMG boxes has also been observed in SoxE genes of other fish, such as Lateolabrax maculatus (LmSox8a, LmSox8b, and LmSox10), Danio rerio (DrSox10), and Paralichthys olivaceus (PoSox8a, PoSox8b, PoSox9a, PoSox9b, PoSox10a, and PoSox10b) [10, 13, 21]. Among these species, only Japanese flounder shows an insertion pattern completely consistent with that of turbot, with an intron in the HMG box of every SoxE gene; in the other fishes, only some SoxE genes carry such an intron, which may reflect genetic diversity among species. We further aligned the six HMG domains and explored the corresponding intron positions in the domains (Figure 2). The logo plots of the HMG-box domain revealed relatively conserved sequences. Moreover, all HMG domains contained the core motif RPMNAFMVW at residues 5 to 13 (Figures 2(a) and 2(b)); this motif recognizes and binds to cis-regulatory elements in the promoters of relevant target genes [12, 35]. The insertion locations of introns in the HMG domain of all SmSoxE genes were identical (Figure 2(a)). Therefore, the intron positions are evolutionarily conserved in the SoxE subgroup among species, a result consistent with those of previous studies [3, 13, 21].

3.3. Phylogenetic Analysis of Turbot SoxE
To assign the six SoxE members to their clades, we used MEGA 7.0 to construct an unrooted phylogenetic tree with 59 full-length sequences of SoxE proteins from twelve species. Turbot SoxE proteins clustered with their corresponding counterparts, and three clades, namely Sox8, Sox9, and Sox10, were identified, conforming to prior classification results (Figure 3). Most SoxE proteins, including Sox9a, Sox9b, Sox10a, and Sox10b, clustered into their respective branches, indicating that SoxE genes of different species may have a common evolutionary origin. However, the phylogenetic tree contains some anomalies. For example, most Sox8a and Sox8b sequences clustered into different branches, which lie close to Sox10 and Sox9, respectively. Similar results were also reported in the closely related species Japanese flounder (P. olivaceus) and tongue sole (C. semilaevis) [13, 36]. This may be attributable to the third WGD event, which left all members of the SoxE subfamily with two paralogous genes. The Sox8a and Sox8b of flounder are more likely to act as substitutes for Sox10 and Sox9, respectively, thereby supplementing evolutionary adaptation and innovation.

3.4. Gene Patterns for Turbot SoxE in Different Tissues
SoxE TFs are involved in different physiological and biochemical events through the activation or inhibition of specific targets, depending on tissue and developmental stage [6, 21]. This study analyzed the expression patterns of six turbot SoxE genes in 11 adult tissues. Expression was detected for all turbot SoxE genes (Figures 4(a)–4(f)). Turbot Sox9a and Sox9b had higher levels in the gill, while the other SoxE genes had very low or negligible expression levels in the liver and gill. Notably, most SoxE genes were expressed at relatively high levels in the brain and intestine, and Sox8b and Sox10b showed their highest expression in the brain. In addition, Sox8a in muscle, Sox9a in gill, Sox9b in stomach, and Sox10a in testis were all relatively high within their own expression profiles. We also observed weak expression of all SoxE genes in the heart, spleen, and kidney, so the role of SoxE genes in the development of these organs needs further investigation. Generally, Sox8, Sox9, and Sox10, the TFs belonging to the subgroup E Sox protein family, exert crucial effects on numerous nervous system developmental processes in vertebrates. These TFs participate in the initial formation of the neural crest and ensure pluripotency maintenance and survival in migratory neural crest stem cells. Moreover, they are crucial regulatory factors for glial development in the CNS and peripheral nervous system [7, 37]. The simultaneous expression of several SoxE genes in the brain therefore suggests that they act in a coordinated manner. Such activity is possibly essential for neurogenesis or for maintaining bioprocesses in the turbot brain.

In vertebrates, all three SoxE proteins (SOX8, SOX9, and SOX10) from diverse species exhibit conserved structure outside the HMG box and high homology within it, with 95% similarity in amino acid sequence. This may be the reason behind the conserved function of SoxE among species [6, 7]. For example, SoxE proteins play multiple roles in mammalian sex determination and differentiation. Among the SoxE proteins, Sox9 is the most important and conserved sex-determining protein. After sex determination, a complicated positive feedback pathway exists among Sox8, Sox9, and Sox10; the pathway is necessary to maintain spermatogenesis and fertility in males [38]. The SoxE paralogs Sox8, Sox9, and Sox10 have been isolated from many fish, and research on their gonad function has also made some progress. In adult Japanese flounder, Sox8b, Sox9a, Sox9b, and Sox10a are expressed in the testis and also in the ovary to a certain extent [13]. In adult spotted sea bass, Sox8a, Sox8b, Sox9b, and Sox10 are highly expressed in the testis and ovaries, whereas Sox9a is almost not expressed in either the testis or the ovaries [14]. According to our results, Sox8a, Sox9a, and Sox10a are upregulated in the testis, and Sox9b is upregulated in the ovary, suggesting that SoxE genes may have regulatory roles in both testis and ovary development. The role of SoxE in turbot sex determination and development requires further exploration in gonads at different developmental stages and should be verified using gene knockout or RNAi. In addition, almost all SoxE genes were expressed at low levels in the spleen and kidney, indicating that these genes may not participate in spleen or kidney development and function maintenance, a result consistent with those of previous reports [3, 13, 14].
4. Conclusion
The present work identified six SoxE genes in S. maximus. Gene expression patterns in adult tissues provide crucial data regarding turbot SoxE activities, and the data should be verified using technologies such as gene knockout and RNAi. These findings would assist in the understanding of fish SoxE subfamily activity and evolution.
Data Availability
All data included in this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by funding from the Priority Academic Program Development of Jiangsu Higher Education Institutions (2021JSPAPD010), the Doctoral Program of Entrepreneurship and Innovation in Jiangsu Province (JSSCBS20221625), the Scientific Research Foundation Program of Jiangsu Ocean University (KQ22009), the Graduate Practice Innovation Program of Jiangsu Province (KYCX2023-112), Shandong Provincial Engineering Project for Enhancing Innovation Capability of Technology-based Small and Medium-sized Enterprises (2023TSGC0401), and Shandong Natural Science Foundation General Project (ZR2022MC041).
Supplementary Materials
Figure S1: Melt curves of all genes used in this study. (A) Sox8a, (B) Sox8b, (C) Sox9a, (D) Sox9b, (E) Sox10a, and (F) Sox10b.