MOLECULAR ANALYSIS OF A PUTATIVE EN/SPM-RELATED TRANSPOSON PROTEIN IN BRASSICA JUNCEA

The open reading frame (ORF) of a putative En/Spm transposon in Brassica may help clarify the relationships among eukaryotic transposable elements. This study was conducted to isolate and analyze a putative En/Spm-related transposon gene from Brassica juncea. PCR products (750 bp) from two B. juncea accessions (PI 649105 and PI 271442) were cloned, sequenced, and analyzed. BLAST showed that the sequences from the two accessions were identical (100% similarity). The amplified transposon sequence and its conserved domain were compared with the GenBank database to evaluate genetic relationships. The putative transposon gene from B. juncea was 98% similar to B. rapa subsp. pekinensis at the nucleotide level and 94% similar to the En/Spm-related transposon protein of B. oleracea at the amino acid level. The conserved domain architecture was related to transposase_21 (pfam02992) and transposase family tnp2, and the protein was predicted to localize to the space outside the outermost cell structure and/or outside the plasma membrane.


Purification, A-Tailing and Cloning of PCR Products
PCR products were purified following the protocol of El Fiky et al. (2019). For A-tailing, 6 µl of purified PCR product, 1 µl of 1x BD buffer, 0.5 µl of FIREPol DNA polymerase (Solis BioDyne, Cat# 01-01-0000S), 2 µl of dATP and 1 µl of MgCl2 were mixed in a microfuge tube and incubated at 70 ºC for 20 min to add A overhangs. The A-tailed PCR products were ligated into the cloning vector with the RBC T&A Cloning Kit (Real Biotech Corporation; RC001 RBC T&A Cloning Kit / RC013 RBC T&A Cloning Vector) according to the manufacturer's instructions.

Preparation of E. coli Competent Cells, Transformation, and Plasmid Extraction
Competent cells were prepared following the protocol of Sambrook (2001). Pellets were resuspended in 80 µl of ice-cold 85 mM CaCl2 with 15% glycerol and stored at -80 ºC. The competent E. coli cells were transformed following the Rapid Transformation Procedure in the user guide for One Shot TOP10 Chemically Competent Cells (Life Technologies Corporation). One hundred µl of the resulting culture was spread on LB plates containing ampicillin (100 µg/ml), and colonies were picked after about 12-16 hours for extraction of the recombinant plasmid. Purified recombinant plasmid DNA was stored at -20 ºC until it was used for sequencing.

Sequencing and Analyses of a Putative Transposon-Gene and Protein
The recombinant plasmid was sequenced by automated DNA sequencing with M13 forward and reverse primers, using a sequencing ready reaction kit (Life Technologies) in combination with ABI-PRISM and the ABI-PRISM Big Dye terminator cycler. The cloned DNA sequences were aligned against Brassica sequences in GenBank using the BLAST 2.2.18 (Basic Local Alignment Search Tool) algorithm at https://blast.ncbi.nlm.nih.gov. The DNA sequences were translated into open reading frames with the Translate tool of the SIB Bioinformatics Resource Portal (ExPASy; https://web.expasy.org/translate/), and the resulting protein was aligned using BLASTP 2.2.18. MEGA version 5.2 (Tamura et al., 2011) was used to construct phylogenetic trees of the putative gene and protein by the UPGMA method (Sneath and Sokal, 1973). Evolutionary distances were computed using the Maximum Composite Likelihood method (Tamura et al., 2004). The native sub-cellular localization of the query protein sequence was predicted with LocTree2, which applies machine learning (profile kernel SVM) to assign one of 18 localization classes for eukaryotes (https://rostlab.org/services/loctree3/).
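To illustrate the clustering step named above, the following is a minimal sketch of UPGMA (average-linkage clustering) in plain Python. It is not the MEGA implementation: for simplicity it uses the p-distance (proportion of differing sites) rather than the Maximum Composite Likelihood distance, and the sequence names and toy sequences are hypothetical, not data from this study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(names, dist):
    """Cluster by UPGMA and return the topology as a nested Newick-like string.

    dist maps frozenset({name_i, name_j}) -> pairwise distance.
    Branch lengths are omitted to keep the sketch short.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        merged = "(%s,%s)" % (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to the merged cluster is the size-weighted mean (UPGMA rule).
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical toy alignment (assumption, for illustration only).
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAA",
    "s3": "TCGAACCTGA",
    "s4": "TCGAACCTGG",
}
distances = {
    frozenset((m, n)): p_distance(toy[m], toy[n]) for m, n in combinations(toy, 2)
}
tree = upgma(list(toy), distances)
print(tree)
```

Because UPGMA assumes a constant rate of change along all lineages, the resulting tree is rooted and ultrametric; a different distance measure can change the topology.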

Cloning of PCR Product
PCR amplification of the putative transposon gene produced one sharp band of 750 bp in each of the two B. juncea accessions, whereas no band was detected in B. napus cultivar Serw 4 (Figure 1). The A-tailed PCR fragments were cloned into the RBC T&A cloning vector, yielding the recombinant construct RBC-putative transposon gene. Recombinant plasmids extracted from transformed E. coli (DH5α) were sequenced.

Analysis of Putative Transposon Gene Sequence
Sequence analysis using BLASTN showed 100% similarity and 0.0% gaps between the two B. juncea sequences (Figure 2). The B. juncea putative transposon gene sequence was submitted to GenBank (accession number MH674328). This sequence showed 98% similarity and 93% query coverage with B. rapa subsp. pekinensis. In the UPGMA tree, the putative gene from B. juncea and 14 B. rapa subsp. pekinensis accessions from the GenBank database formed a monophyletic group (Figure 3). The DNA sequences grouped into two major clusters. The first cluster was highly divergent and consisted of the putative gene from B. juncea (MH674328). The second cluster contained the 14 B. rapa accessions and was divided into two closely related sub-clusters.
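As a hedged illustration of how similarity and gap percentages like those above are derived from a pairwise alignment, the sketch below counts identical columns and gap columns over the alignment length. The function name and the toy alignment are hypothetical; this is not BLAST itself, which also performs the alignment and scoring.

```python
from typing import Tuple


def identity_and_gaps(aligned_a: str, aligned_b: str) -> Tuple[float, float]:
    """Return (percent identity, percent gaps) for two pre-aligned sequences.

    Both values are expressed relative to the alignment length;
    '-' marks a gap in either sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    gaps = sum(1 for x, y in pairs if x == "-" or y == "-")
    return 100.0 * matches / len(pairs), 100.0 * gaps / len(pairs)


# Hypothetical toy alignment (assumption, not sequences from this study).
toy_a = "ATGGCT-AAGC"
toy_b = "ATGGCTTAAGC"
identity, gaps = identity_and_gaps(toy_a, toy_b)
print("identity %.1f%%, gaps %.1f%%" % (identity, gaps))
```

Two identical sequences aligned without gaps give 100% identity and 0% gaps, which is the situation reported for the two B. juncea accessions.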

Analysis of Putative Transposon Protein Sequence
DNA transposons are a recurrent source of coding sequences for the emergence of new genes. The ExPASy Translate tool translates a nucleotide (DNA) sequence into a protein (amino acid) sequence. Translation showed that 5'3' frame 1 contained one open reading frame (ORF) of 242 amino acids, and the sequence of this protein was submitted to GenBank, EMBL, DDBJ and PDB (accession number QAU19549). The ORF was then analyzed with BLASTP to align it against the protein database collections. The results showed 94% similarity and 100% query coverage between this protein and the En/Spm-related transposon protein. A putative conserved transposase_21 domain was detected between nucleotides 176-250. In the UPGMA tree, the putative transposon protein and 16 protein accessions from the GenBank database formed a monophyletic group (Figure 4). The tree grouped the sequences into two main clusters. In the first cluster, the putative transposon protein (accession number QAU19549) was closely related to the En/Spm-related transposon protein (accession number ACG60686). The second cluster was highly diverse and composed of 15 uncharacterized proteins from Brassica.
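The frame 1 translation described above can be sketched as follows. This is a minimal illustration using the standard genetic code, not the ExPASy implementation; the function names and the toy sequence are hypothetical. As a consistency check on the reported sizes, an ORF of 242 amino acids needs 242 x 3 = 726 coding nucleotides (729 with a stop codon), which fits within the 750 bp product.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame1(dna):
    """Translate the forward strand in frame 1; unrecognized codons become 'X'."""
    seq = dna.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(protein):
    """Longest stretch from a Met to the next stop ('*') or the sequence end."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


# Hypothetical toy sequence (assumption, not the MH674328 sequence).
toy_dna = "ATGGCTAAAGGTTAACC"
toy_protein = translate_frame1(toy_dna)
print(toy_protein, longest_orf(toy_protein))
```

The other two forward frames can be obtained by passing `dna[1:]` and `dna[2:]`; the reverse frames would additionally require the reverse complement.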

Analysis of Putative Transposon Gene Sequence
Cytogenetic and breeding studies have largely established the genetic relationships among Brassica oilseed species. Morinaga, Arabidopsis. For this reason, the results obtained here indicate that the B. juncea and B. rapa subsp. pekinensis genome sequences provide a valuable resource for analyzing the evolution of amphidiploid genomes through their transposon genes.

Analysis of Putative Transposon Protein Sequence
The Conserved Domain Database (CDD) classifies proteins into functionally distinct families and sub-families and uses conserved domain architectures to facilitate comparative studies of protein families (Marchler-Bauer et al., 2017). The results showed that the conserved protein domain family is transposase_21 (pfam02992) and transposase family tnp2, and that the conserved domain architecture of this enzyme shares protein homology with putative transposase, putative retrotransposase and putative CACTA transposon proteins. Nouroz et al. (2017) reported that Brassica CACTA or En/Spm transposases are not only conserved in diploid Brassicas but also actively proliferate in allotetraploid Brassicas (B. juncea, B. napus, B. carinata) and in Arabidopsis sister genera. Many families of the En/Spm superfamily are not readily recognized by computer-assisted database searches (Wang et al., 2003; Wicker et al., 2003). The availability of genomic sequence data and sequence search tools has allowed bioinformatic methods to identify DNA transposon families based on sequence similarity to a known class II element, because the terminal sequences of DNA transposons are often the only requirement for transposase recognition (Craig et al., 2002). DNA transposons are grouped into several families based on the structural diversity of their transposases, of which six (CACTA, hAT, Harbingers, Helitron, Mutator and Mariner) are widespread in plants (Wicker et al., 2007; Kapitonov and Jurka, 2008). As in other plants, the Brassica genome also harbours transposable elements (TEs) such as LTR retrotransposons (Nouroz et al., 2015c) and DNA transposons such as Mutator (Nouroz and Noreen, 2015), hATs (Nouroz et al., 2015b) and Harbingers (Zhang and Wessler, 2004; Nouroz et al., 2016). Only a few of these genes have been investigated functionally, so the functional contribution of the transposase domain(s) to the corresponding protein remains a matter of speculation in most cases.
However, some predictions can be made from functional analyses of the associated transposases. We therefore submitted the amino acid sequence of this ORF to LocTree3 to predict its localization (Figure 5). The protein obtained in this study (id: LOC) was assigned, with an expected prediction accuracy of 91%, to GO:0005576, which is defined as the space external to the outermost structure of a cell. For cells without external protective or external encapsulating structures, this refers to the space outside the plasma membrane (Goldberg et al., 2014).

CONCLUSION
The current study detected a 750 bp band, and sequence analysis indicated that the sequence is highly similar to B. rapa subsp. pekinensis. The conserved domain architecture is related to transposase_21 (pfam02992) and transposase family tnp2, and the protein was predicted to localize to the space outside the cell's outermost structure and/or outside the plasma membrane. The putative En/Spm transposon in Brassica may help in understanding the relationships among eukaryotic transposable elements.