ERAP1



ENSP00000296754

This Project

The purpose of this project is to annotate specific genes in the newly sequenced whale shark genome in order to contribute to the scientific study of the whale shark and the immune system.

Background Information
The ERAP1 gene encodes endoplasmic reticulum aminopeptidase 1. This protein is crucial for peptide trimming, which brings peptides to the correct length ("ERAP1 Protein (Homo Sapiens) - STRING Network View" 2015). This regulated length is required to generate most HLA class I-binding peptides. Furthermore, Ian A. York, in the Department of Pathology at the University of Massachusetts Medical School, found that ERAP1 trims MHC class I-presented peptides in vivo and plays a significant role in immunodominance. In that study, the loss of this single enzyme shifted the immunodominance hierarchy in viral infections. Its loss affects the ability to create or destroy antigenic peptides and reduces the immune response to some viral peptides (York 2006). Cleaving proteins into small peptides also allows those peptides to be recognized by the immune system. If the immune system recognizes a peptide as foreign, it will trigger a response that causes the infected cell to self-destruct (Aldhamen et al. 2013). The ERAP1 protein may also play a role in the inactivation of peptide hormones and the regulation of blood pressure ("Endoplasmic Reticulum Aminopeptidase 1" 2015). ERAP1 is therefore essential for immune system function: its role in peptide trimming supports a normal immune response against viruses and other infections.



Figure 1. The role of ERAP1 in peptide cleaving (Crimi 2007).




Protein Domains

ERAP1 contains the M1_APN_2 domain, the ERAP1_C domain, and the Peptidase_M1 multidomain, as outlined in Figure 2. The M1_APN_2 domain, also known as the Peptidase M1 Aminopeptidase N family, includes tricorn interacting factor F3, ERAP1, and aminopeptidase Q (APQ). The family has both eukaryotic and bacterial members. Its function is to preferentially cleave neutral amino acids from the N-termini of oligopeptides in a variety of human tissues. The ERAP1_C domain is a large C-terminal domain that faces the active site of the peptidase domain. Peptidase M1 is a family of aminopeptidases with differing but very specific functions, which include hydrolysing acidic, basic, or neutral N-terminal residues. The enzymes in this family all perform aminopeptidase functions. All information about the domains present in this protein and their functions was found on the NCBI BLAST site.




(Figure 2: Domains of ERAP1 protein, as given by BLAST results. Source: http://blast.ncbi.nlm.nih.gov)


Methods


Whale Shark Protein Sequence
The ERAP1 protein sequence was found by searching for the sequence ID (ENSP00000296754) in the Ensembl database and downloading the sequence of the top hit in FASTA format. This sequence was then used as the query for a search of the NCBI BLAST database. The full protein sequence of the best match was copied and used as the query for a BLAST search against the predicted whale shark proteome on the Galaxy server. The top predicted protein matches, chosen by lowest E-value and longest alignment length, were then themselves used as queries for a BLAST search against the NCBI human protein database to check for reciprocity.
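
The ranking step described above can be sketched in Python. This is a minimal illustration, not the procedure actually run on the Galaxy server: `Hit` and `rank_hits` are hypothetical names, the values are the E-values and alignment lengths reported in Table 1, and treating alignment length purely as a tie-breaker is an assumption, since the text does not say how the two criteria were combined.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record type for this sketch)."""
    subject_id: str
    evalue: float
    alignment_length: int


def rank_hits(hits):
    """Sort hits so the smallest E-value comes first.

    Assumption: a longer alignment only breaks ties between equal E-values.
    """
    return sorted(hits, key=lambda h: (h.evalue, -h.alignment_length))


# Values taken from Table 1, listed out of order on purpose.
hits = [
    Hit("g40744.t1", 2e-13, 128),
    Hit("g29177.t1", 1e-26, 60),
    Hit("g22270.t1", 6e-13, 134),
    Hit("g34464.t1", 2e-20, 52),
]

ranked_ids = [h.subject_id for h in rank_hits(hits)]
print(ranked_ids)
```

With these values the E-values alone decide the order, so the tie-breaking assumption does not change the result here.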

Phylogenetic Tree
For each species, the protein match with the smallest E-value from a BLAST search using the human protein as the query was added to a ClustalW multiple sequence alignment, together with the three whale shark predicted proteins with the next smallest E-values. This alignment was used to create the phylogenetic tree. Default settings of the ClustalW website were used.


Results

Whale Shark Protein Sequence
The human protein sequence was used as the query against the database of predicted whale shark proteins. The best four matches are tabulated in Table 1 below, ranked from smallest to largest E-value.

| Sequence ID | E-Value | Alignment Length | Percentage of Positive Matches | Reciprocal Name | Reciprocal E-value |
| --- | --- | --- | --- | --- | --- |
| g29177.t1 | 1e-26 | 60 | 83.33 | ERAP 1 isoform a precursor | 7e-22 |
| g34464.t1 | 2e-20 | 52 | 76.92 | aminopeptidase N precursor | 5e-23 |
| g40744.t1 | 2e-13 | 128 | 50.00 | glutamyl aminopeptidase | 1e-35 |
| g22270.t1 | 6e-13 | 134 | 49.25 | aminopeptidase N isoform X2 | 7e-42 |

Table 1. Percentage of positive matches is the percentage of aligned positions at which the two amino acids are identical or chemically similar. Reciprocal name and reciprocal E-value are the name and E-value of the best match (by smallest E-value) returned when the whale shark predicted protein is used as the query for a BLAST search of the NCBI human protein database.
Since g29177.t1 was the best match when human ERAP1 was BLASTed against the whale shark proteome, and ERAP1 was the best match when g29177.t1 was BLASTed against the human proteome, the two are reciprocal best matches: each returns the other as its best match regardless of which is the query. It therefore seems plausible that g29177.t1 is an ortholog of ERAP1.
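
The reciprocal best match test can be written as a short check. The sketch below is only an illustration under stated assumptions: the function and variable names are hypothetical, the forward E-values and reciprocal names are copied from Table 1, and only the single best human match per whale shark protein is used because that is all the table reports.

```python
# Forward search: human ERAP1 queried against the whale shark predicted proteome.
# Keys are whale shark sequence IDs, values are the E-values from Table 1.
FORWARD_EVALUES = {
    "g29177.t1": 1e-26,
    "g34464.t1": 2e-20,
    "g40744.t1": 2e-13,
    "g22270.t1": 6e-13,
}

# Reverse search: best human match reported for each whale shark protein (Table 1).
REVERSE_BEST_MATCH = {
    "g29177.t1": "ERAP 1 isoform a precursor",
    "g34464.t1": "aminopeptidase N precursor",
    "g40744.t1": "glutamyl aminopeptidase",
    "g22270.t1": "aminopeptidase N isoform X2",
}


def reciprocal_best_match(query_name, forward_evalues, reverse_best_match):
    """Return the forward best hit if its own best match is the query, else None."""
    best_forward = min(forward_evalues, key=forward_evalues.get)
    if reverse_best_match.get(best_forward) == query_name:
        return best_forward
    return None


candidate = reciprocal_best_match(
    "ERAP 1 isoform a precursor", FORWARD_EVALUES, REVERSE_BEST_MATCH
)
print(candidate)
```

Only the top forward hit is tested, so the other three whale shark proteins, whose best human matches are different aminopeptidases, never qualify as candidates under this definition.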


Orthologs
The human ERAP1 protein was investigated to find orthologs in various other species; the results are shown in Table 2 below. Since reciprocal best matches were found for all vertebrates except the elephant shark, the ERAP1 protein appears to be largely conserved among vertebrates. Furthermore, although exact orthologs were not found in the non-vertebrates, their proteins still showed striking similarity, with even the yeast protein maintaining 95% query coverage and 30% identity.
| Species | Name | ID | Length | E-Value | Reciprocity |
| --- | --- | --- | --- | --- | --- |
| Human | ERAP 1 isoform a precursor | NP_057526.3 | 948 | N/A | N/A |
| Mouse | endoplasmic reticulum aminopeptidase 1 precursor | NP_109636.1 | 930 | 0.0 | Yes |
| Chicken | PREDICTED: endoplasmic reticulum aminopeptidase 1 | XP_001232418.2 | 929 | 0.0 | Yes |
| Clawed Frog | PREDICTED: endoplasmic reticulum aminopeptidase 1 | XP_002933475.2 | 932 | 0.0 | Yes |
| Zebrafish | PREDICTED: endoplasmic reticulum aminopeptidase 1 isoform X1 | XP_005171857.1 | 943 | 0.0 | Yes |
| Atlantic Cod | N/A | gadMor1_genscan_ HE570838.164_1 | 1453 | 0.0 | Yes |
| Elephant Shark | N/A | calMil1_genscan_ KI635881.103_1 | 1271 | 0.0 | No |
| Fruit Fly | CG8773 | NP_650273.2 | 994 | 5e-171 | No |
| Rice | Os02g0218200 | NP_001046303.1 | 878 | 1e-143 | No |
| Yeast | Ape2p | NP_012765.3 | 952 | 6e-127 | No |

Table 2. The best match (by smallest E-value, query coverage, and percent identity) for each species when human ERAP 1 isoform a precursor is BLASTed against that species' NCBI or Galaxy database. Length is the amino acid length of the protein. Reciprocity indicates whether an NCBI BLAST of the human proteome with the listed protein as the query returns ERAP 1 isoform a precursor as the best match. A reciprocal best match is seen in all vertebrates except the elephant shark, indicating that the protein is common among vertebrates, but also that the whale shark may or may not have an ortholog, since the elephant shark does not.
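
The pattern described in the caption can be checked directly against the table. The following sketch is illustrative only: the names are hypothetical, the reciprocity values are transcribed from Table 2, and the vertebrate flag is added here from general taxonomy rather than taken from the table. The human row is left out because it is the query itself.

```python
# (species, is_vertebrate, reciprocal_best_match) rows based on Table 2.
# The is_vertebrate flag is an addition for this sketch, not a table column.
TABLE_2_ROWS = [
    ("Mouse", True, True),
    ("Chicken", True, True),
    ("Clawed Frog", True, True),
    ("Zebrafish", True, True),
    ("Atlantic Cod", True, True),
    ("Elephant Shark", True, False),
    ("Fruit Fly", False, False),
    ("Rice", False, False),
    ("Yeast", False, False),
]


def species_where(rows, vertebrate, reciprocal):
    """List species whose vertebrate status and reciprocity match the arguments."""
    return [
        name
        for name, is_vertebrate, is_reciprocal in rows
        if is_vertebrate == vertebrate and is_reciprocal == reciprocal
    ]


vertebrates_without_reciprocity = species_where(TABLE_2_ROWS, True, False)
non_vertebrates_with_reciprocity = species_where(TABLE_2_ROWS, False, True)
print(vertebrates_without_reciprocity)
print(non_vertebrates_with_reciprocity)
```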


Phylogeny and Phylogenetic Tree
The phylogenetic tree built from all the protein sequences in Tables 1 and 2, shown in Figure 3, largely corresponds with phylogenetic trees constructed using homologous anatomy and genome sequencing. Interestingly, the proteins that seem to have diverged first are the elephant shark protein and the two lesser-matched whale shark predicted proteins. However, the predicted protein we believe to be an ortholog, g29177.t1, sits in the phylogenetic tree exactly where we would predict: it shares a more recent common ancestor with mammals than the fruit fly (an invertebrate) does, but a less recent one than the Atlantic cod or zebrafish, both ray-finned fishes, do.
Figure 3. The phylogenetic tree of ERAP1 best matches from Table 2. The best match from each species and the 4 best whale shark predicted proteins were aligned using ClustalW; each point of divergence indicates a most recent common ancestor, and branch length represents evolutionary distance. The g29177.t1 protein sits exactly where one would predict from the most recent common ancestors of the various species: it is more closely related to mammals than the fruit fly is, but less closely related than the zebrafish is, which gives further evidence that g29177.t1 is an ortholog of ERAP1.
Conclusion
The phylogenetic tree and the data acquired from the BLAST analyses suggest that humans and the whale shark share orthologous aminopeptidases. These aminopeptidases seem to be related through the presence of the ERAP1 gene. ERAP 1 isoform a precursor, aminopeptidase N precursor, glutamyl aminopeptidase, and aminopeptidase N isoform X2 were the best human matches for the top four whale shark predicted proteins, which serves as an indication of shared orthologous genes between the two species. Additionally, the alignment between human ERAP1 and g29177.t1 spans residues 702 to 761, so it lies entirely within the ERAP1_C domain. These results suggest that whale shark ERAP1 is closely related to human ERAP1 and likely performs closely related functions.
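
As a rough consistency check on the alignment coordinates, the span can be compared with the figures in Tables 1 and 2. This is a small sketch with hypothetical function names, and it assumes that 702 and 761 are inclusive residue positions on the 948-residue human sequence listed in Table 2.

```python
HUMAN_ERAP1_LENGTH = 948            # residues, from Table 2
ALIGN_START, ALIGN_END = 702, 761   # alignment coordinates from the Conclusion


def span_length(start, end):
    """Number of residues in an inclusive [start, end] span."""
    return end - start + 1


def percent_coverage(start, end, protein_length):
    """Percentage of the protein covered by the inclusive span."""
    return 100.0 * span_length(start, end) / protein_length


aligned_residues = span_length(ALIGN_START, ALIGN_END)
coverage = percent_coverage(ALIGN_START, ALIGN_END, HUMAN_ERAP1_LENGTH)
print(aligned_residues)      # 60, matching the alignment length in Table 1
print(round(coverage, 1))    # 6.3 (percent of the human protein)
```

Under that assumption the span agrees with the alignment length of 60 reported for g29177.t1 in Table 1, and it covers only a small part of the human protein, which is worth keeping in mind when weighing the ortholog assignment.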

References


Aldhamen, Yasser A., Sergey S. Seregin, David P. W. Rastall, Charles F. Aylsworth, Yuliya Pepelyayeva, Christopher J. Busuito, Sarah Godbehere-Roosa, Sungjin Kim, and Andrea Amalfitano. "Endoplasmic Reticulum Aminopeptidase-1 Functions Regulate Key Aspects of the Innate Immune Response." Ed. Jörg Hermann Fritz. PLoS ONE 8.7 (2013): e69539. Web.

"Basic Local Alignment Search Tool." BLAST:. N.p., n.d. Web. 15 Apr. 2015.

Crimi, Bob. The Final Cut: How ERAP1 Trims MHC Ligands to Size. Digital image. Nature Immunology. N.p., 2007. Web.

"Endoplasmic Reticulum Aminopeptidase 1." ERAP1. N.p., n.d. Web. 01 Apr. 2015.

"ERAP1 Protein (Homo Sapiens) - STRING Network View." ERAP1 Protein (Homo Sapiens) - STRING Network View. N.p., n.d. Web. 01 Apr. 2015.

York, I. A. "Endoplasmic Reticulum Aminopeptidase 1 (ERAP1) Trims MHC Class I-presented Peptides in Vivo and Plays an Important Role in Immunodominance." Proceedings of the National Academy of Sciences 103.24 (2006): 9202-207. Web.