This Project

Figure 1. Computer model of RUNX3 [1].
This web page originated as an assignment in Emory University's Biology 142 lab course.
Students were assigned proteins of interest and asked to research what is known about each protein and to examine whether the newly sequenced whale shark genome shows evidence of an orthologous protein.


Background Information

RUNX3 is a member of the runt domain family of transcription factors, which are important regulators of lineage-specific gene expression during development (Fainaru et al., 2004). RUNX3 is the smallest gene in the RUNX family because it has the fewest exons (Bangsow et al., 2001). RUNX3 is also a key regulator in autoimmune diseases. For example, it is commonly expressed in lymphoid populations and is involved in the development of neurons and in signaling in dendritic cells (Brenner et al., 2004). In the mouse, RUNX3 is most commonly expressed in developing bones and sensory ganglia (Levanon et al., 2001). Within the dorsal root ganglia of the nervous system, RUNX3 is a neuron-specific transcription factor expressed specifically in proprioceptive neurons that express the TrkC tyrosine kinase receptor (Levanon et al., 2002). In addition, RUNX3 acts as a tumor suppressor in several cancers, most commonly ovarian cancer and breast cancer. In ovarian cancer, RUNX3 is expressed in the cell nuclei. It can also act as an oncogene, causing ovarian cells to turn into cancerous tumor cells (Lee et al., 2011).


Methods

Human Protein Sequencing

First, the Ensembl website [2] was used to retrieve RUNX3’s amino acid sequence in FASTA format by entering its Ensembl ID, ENSP00000308051, into the human database.
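As a rough illustration of what "FASTA format" means for the later steps, the sketch below parses FASTA text into a dictionary of header to sequence. The function name `parse_fasta` and the example record are assumptions made for illustration; the residues shown are a placeholder, not the real RUNX3 sequence.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them and join later.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical placeholder record; NOT the real RUNX3 sequence.
example = """>ENSP00000308051 placeholder
ACDEFGHIKL
MNPQRSTVWY
"""

records = parse_fasta(example)
for name, sequence in records.items():
    print(name, len(sequence))
```

A sequence downloaded from Ensembl could be read from a file and passed to the same function before being submitted to BLAST.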

Finding orthologs

The resulting amino acid sequence was then blasted against the whale shark database on the Whale Shark Genome Galaxy website [3]. The FASTA sequence of the best protein match was recorded and underwent a reciprocal BLAST against the human RUNX3 sequence on the NCBI BLAST site [4]. Other homologs were then detected by blasting the human RUNX3 sequence and the whale shark sequence on the NCBI website against the databases of the elephant shark, mouse, zebra fish, and yeast.
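The logic of a reciprocal BLAST can be sketched in a few lines: a candidate ortholog is supported only if the best hit in the reverse direction points back to the original query. The function and dictionary names below are hypothetical; the two best hits are the ones reported in Table 1, with shortened labels.

```python
def reciprocal_best_hit(query, forward_best, reverse_best):
    """Return (candidate, back_hit, is_reciprocal) for a query protein.

    forward_best maps a query to its best hit in the target species;
    reverse_best maps that hit to its best hit back in the query species.
    """
    candidate = forward_best[query]
    back_hit = reverse_best[candidate]
    return candidate, back_hit, back_hit == query


# Best hits as reported in Table 1 (labels shortened for illustration).
forward_best = {"human RUNX3": "g.47267.t1"}
reverse_best = {"g.47267.t1": "human RUNX1"}

result = reciprocal_best_hit("human RUNX3", forward_best, reverse_best)
print(result)
```

Here the reverse search lands on RUNX1 rather than RUNX3, so the pair is not a reciprocal best hit.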

Phylogenetic Tree

The FASTA sequences of the hits with the lowest E-value and the highest percent identity and query coverage were then recorded and entered into ClustalW [5], along with the whale shark and human protein sequences, to compare the protein sequences and their phylogeny. A phylogenetic tree was generated from the results.
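One way to make the "best hit" criterion explicit is to rank hits by E-value first and break ties with percent identity and query coverage. This is a minimal sketch of that ranking, not the procedure the BLAST website applies; the `Hit` type, the function name, and the three example hits are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    description: str
    query_cover: float  # percent of the query covered by the alignment
    e_value: float
    identity: float     # percent identical residues


def pick_best_hit(hits):
    """Lowest E-value wins; ties go to higher identity, then higher coverage."""
    return min(hits, key=lambda h: (h.e_value, -h.identity, -h.query_cover))


# Hypothetical hits, for illustration only.
hits = [
    Hit("hypothetical protein A", 95.0, 1e-50, 70.0),
    Hit("hypothetical protein B", 98.0, 1e-50, 75.0),
    Hit("hypothetical protein C", 40.0, 0.5, 25.0),
]

best = pick_best_hit(hits)
print(best.description)
```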


Results & Discussion

Protein Domains

A BLAST was performed with the human RUNX3 protein sequence against the databases for the whale shark and other species, including the elephant shark, zebra fish, yeast, and mouse. The results are shown below in Figure 2. The main protein domains found were the Runt superfamily and the RunxI superfamily. The reciprocal BLAST was also performed, comparing the whale shark protein sequence against the human database and the databases of the species previously mentioned. The results are depicted below in Figure 3. Similarly, the Runt superfamily was identified as the main protein domain that aligned between the whale shark and the other species. These superfamilies contain all of the Runx-related transcription factors (Marchler-Bauer et al., 2015). This shows conservation of the Runt transcription factor protein domain through organismal speciation.


Figure 2. The human protein sequence for RUNX3 was blasted against the protein databases of the whale shark, elephant shark, mouse, zebra fish, and yeast. The protein sequences were compared, and the major protein domains identified were the Runt superfamily and the RunxI superfamily.


Figure 3. The whale shark protein sequence, obtained from the BLAST of the human RUNX3 protein against the whale shark database, was blasted against the databases of the human, elephant shark, mouse, zebra fish, and yeast. The sequences were compared, and the major protein domain, the Runt superfamily, is pictured above.


Orthologs

The best hits from the BLAST of the human RUNX3 protein sequence against the databases of the whale shark, elephant shark, zebra fish, yeast, and mouse are shown below (Table 1). The best hits from the BLAST of the whale shark protein sequence (g.47267.t1) against the databases of the human, zebra fish, mouse, elephant shark, and yeast are also shown (Table 1).

Blasting the human RUNX3 sequence against the whale shark database returned a protein hit with a very high percent identity (89.22%) and a very low E-value (6e-94), but a low query coverage (42.75%). The reciprocal BLAST returned a match with a very low E-value (1e-104), high percent identity (88%), and good query coverage (73%), but the match was runt-related transcription factor 1 (RUNX1), not RUNX3. The much higher query cover of the whale shark sequence against the human RUNX1 protein, at a comparable percent identity, suggests that the whale shark's transcription factor within the Runt superfamily is more similar in structure to the RUNX1 protein found in other organisms. The data also suggest that the human protein is most closely related to the protein found in mice and least related to the protein found in yeast. From this set of data, the whale shark protein appears to be about equally related to the proteins of the other organisms, with the exception of yeast, to which it is much less related.

| Query | Database | Description | Query Cover | E Value | Identity |
|---|---|---|---|---|---|
| Human | Whale Shark | g.47267.t1 | 42.75% | 6e-94 | 89.22% |
| Human | Elephant Shark | runt-related transcription factor 3 | 100% | 0.0 | 72% |
| Human | Zebra Fish | runt-related transcription factor 3 | 100% | 7e-171 | 64% |
| Human | Yeast | UV-damaged DNA-binding protein RAD7 | 20% | 0.19 | 19% |
| Human | Mouse | runt-related transcription factor 3 | 98% | 0.0 | 91% |
| Whale Shark | Human | runt-related transcription factor 1 isoform AML1a | 73% | 1e-104 | 88% |
| Whale Shark | Zebra Fish | runt-related transcription factor 1 | 74% | 4e-101 | 85% |
| Whale Shark | Mouse | runt-related transcription factor 1 isoform 4 | 73% | 2e-104 | 89% |
| Whale Shark | Elephant Shark | runt-related transcription factor 1 | 73% | 2e-99 | 84% |
| Whale Shark | Yeast | Vba5p | 40% | 5.1 | 23% |

Table 1. The human RUNX3 protein was blasted against the protein databases of the whale shark, elephant shark, zebra fish, yeast, and mouse. The best hits, the proteins with the lowest E-value, are shown in the table above. The reciprocal BLAST was then conducted using the whale shark protein, g.47267.t1, against the protein databases of the human, zebra fish, mouse, elephant shark, and yeast. The best hits are also shown above.
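The Query Cover and Identity columns of Table 1 can be read as simple ratios over an alignment. The sketch below assumes a single gapped pairwise alignment, counts identity over all alignment columns, and counts coverage as aligned query residues over the full query length; BLAST's own reporting may combine several aligned segments, so treat this as an approximation. The function names, the toy alignment, and the query length of 20 are hypothetical.

```python
def percent_identity(aligned_query, aligned_subject):
    """Identical residues as a percentage of all alignment columns."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def query_cover(aligned_query, query_length):
    """Aligned query residues as a percentage of the full query length."""
    covered = sum(1 for q in aligned_query if q != "-")
    return 100.0 * covered / query_length


# Toy alignment, for illustration only ('-' marks a gap).
aligned_query = "ACDE-GHIK"
aligned_subject = "ACDQFGHLK"

identity = percent_identity(aligned_query, aligned_subject)
cover = query_cover(aligned_query, query_length=20)
print(round(identity, 1), cover)
```

Under these definitions a hit can have high identity but low coverage, which is the pattern seen for the human RUNX3 query against the whale shark database.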

Phylogeny

The human RUNX3 sequence was blasted against the protein databases of the mouse, elephant shark, whale shark, zebra fish, and yeast. The amino acid sequences were then compared, and the resulting phylogenetic tree is pictured below (Figure 4). The phylogeny suggests that the human and mouse have closely related protein sequences and that speciation between them occurred most recently. The same relationship holds for the elephant shark and whale shark. All four of these organisms appear to have evolved from the same common ancestor. The yeast protein sequence is the least similar to those of the other species.


Figure 4. The human RUNX3 protein sequence was compared against the protein sequences of the whale shark, elephant shark, zebra fish, mouse, and yeast. The FASTA sequence of the best hit from each species was compared to the original RUNX3 sequence in ClustalW, and the resulting phylogenetic tree is shown above. The human protein is most closely related to the protein found in mice. The whale shark protein is most closely related to the one found in the elephant shark. The yeast protein has the least in common with the proteins from the other species.
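Trees like the one in Figure 4 are typically derived from pairwise distances between aligned sequences. The sketch below shows only that first step, using the proportion of differing sites (p-distance) and reporting the closest pair; it does not reproduce ClustalW's alignment or tree-building methods. The taxon labels and sequences are hypothetical toy data, not the sequences used in this project.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping columns where either has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Map each unordered pair of names to its p-distance."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Toy aligned sequences, for illustration only.
alignment = {
    "taxon_a": "ACDEFGHIKL",
    "taxon_b": "ACDEFGHIKM",
    "taxon_c": "ACDQFGYIKL",
    "taxon_d": "WCDQFGYIRL",
}

distances = distance_matrix(alignment)
closest = min(distances, key=distances.get)
print(closest, distances[closest])
```

A tree-building method would join the closest pair first and repeat on the reduced matrix, which is why the most similar sequences end up as neighboring tips.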



Conclusion

Based on the data collected, we conclude that the whale shark has a transcription factor protein that derives from the same Runt superfamily as the human RUNX3 protein but is probably more closely related to the human RUNX1 protein. The initial BLAST of the human protein against the whale shark protein database returned a protein with a low query coverage of 42.75%. In the original BLAST of the human RUNX3 protein sequence against the whale shark database, two conserved protein domains were found: the Runt superfamily and the RunxI superfamily. However, the reciprocal BLAST identified only the Runt superfamily. This suggests that the protein found in the whale shark is not identical to the one found in humans. The high query coverage and percent identity, along with the low E-value, of the reciprocal BLAST's alignment with the human RUNX1 protein (as seen in Table 1) suggest that the whale shark has a form of the RUNX1 protein instead. The phylogenetic tree lets us tentatively trace how the original transcription factor protein diverged between unicellular and multicellular eukaryotes, then again in zebra fish, and then again between humans and mice and between whale sharks and elephant sharks. Further research could determine the function of the orthologous protein found in whale sharks. This would help in understanding the immune system of whale sharks and the protein's role as a transcription factor.


Works Cited

Bangsow, C., Rubins, N., Glusman, G., Bernstein, Y., Negreanu, V., Goldenberg, D., Lotem, J., Ben-Asher, E., Lancet, D., Levanon, D., Groner, Y. (2001). The RUNX3 gene—sequence, structure and regulated expression. Gene 279: 221–232.

Brenner, O., Levanon, D., Negreanu, V., Golubkov, O., Fainaru, O., Woolf, E., Groner, Y. (2004). Loss of Runx3 function in leukocytes is associated with spontaneously developed colitis and gastric mucosal hyperplasia. PNAS 101: 16016-16021.

Fainaru, O., Woolf, E., Lotem, J., Yarmus, M., Brenner, O., Goldenberg, D., Negreanu, V., Bernstein, Y., Levanon, D., Jung, S. and Groner, Y. (2004). Runx3 regulates mouse TGF-β-mediated dendritic cell function and its absence results in airway inflammation. The EMBO Journal 23: 969–979.

Lee, C., Chuang, L., Kimura, S., Lai, S., Ong, C., Yan, B., Salto-Tellez, M., Choolani, M., Ito, Y. (2011). RUNX3 functions as an oncogene in ovarian cancer. Gynecologic Oncology 122: 410-417.

Levanon, D., Bettoun, D., Harris-Cerruti, C., Woolf, E., Negreanu, V., Eilam, R., Bernstein, Y., Goldenberg, D., Xiao, C., Fliegauf, M., Kremer, E., Otto, F., Brenner, O., Lev-Tov, A., Groner, Y. (2002). The Runx3 transcription factor regulates development and survival of TrkC dorsal root ganglia neurons. The EMBO Journal 21: 3454-3463.

Levanon, D., Brenner, O., Negreanu, V., Bettoun, D., Woolf, E., Eilam, R., Lotem, J., Gat, U., Otto, F., Speck, N., Groner, Y. (2001). Spatial and temporal expression pattern of Runx3 (Aml2) and Runx1 (Aml1) indicates non-redundant functions during mouse embryogenesis. Mechanisms of Development 109: 413-417.

Marchler-Bauer, A., et al. (2015). CDD: NCBI's conserved domain database. Nucleic Acids Research 43 (Database issue): D222-D226.


References

1. ^ Protein Data Bank in Europe. (n.d.). Retrieved from http://www.ebi.ac.uk/pdbe-srv/view/entry/1cmo/summary.html

2. ^ http://useast.ensembl.org/index.html

3. ^ http://whaleshark.georgiaaquarium.org/

4. ^ http://blast.ncbi.nlm.nih.gov/Blast.cgi

5. ^ http://www.genome.jp/tools/clustalw/