Monarch geneset OGS2.0

DPOGS204755
Transcript: DPOGS204755-TA, 1038 bp
Protein: DPOGS204755-PA, 345 aa
Genomic position: DPSCF300231 - 105475-107120
RNAseq coverage: 371x (Rank: top 32%)
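The lengths above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the transcript length counts the coding sequence including its stop codon and that the genomic coordinates are 1-based and inclusive; neither convention is stated in the record, and the helper names are hypothetical.

```python
def expected_cds_length(protein_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues (plus an optional stop codon)."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 1038
protein_aa = 345
genomic_start, genomic_end = 105475, 107120

# 345 residues + 1 stop codon = 346 codons = 1038 nt.
cds_matches = expected_cds_length(protein_aa) == transcript_bp

# Genomic span covered by the gene model, and the part not present in the transcript.
genomic_span = span_length(genomic_start, genomic_end)
non_coding_bp = genomic_span - transcript_bp

print(cds_matches, genomic_span, non_coding_bp)
```

Under these assumptions the transcript length agrees with the protein length, and the genomic span is longer than the transcript, which would be consistent with intronic sequence in the gene model.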
Annotation
Heliconius       HMEL012756        5e-131  71.84%
Bombyx           BGIBMGA002855-TA  2e-75   48.86%
Drosophila       rho-6-PB          2e-19   32.42%
EBI UniRef50     UniRef50_Q7QKF9   1e-21   33.71%  AGAP009451-PA n=4 Tax=Culicidae RepID=Q7QKF9_ANOGA
NCBI RefSeq      XP_001649294.1    3e-24   30.33%  hypothetical protein AaeL_AAEL004502 [Aedes aegypti]
NCBI nr blastp   gi|157106371      6e-23   30.33%  hypothetical protein AaeL_AAEL004502 [Aedes aegypti]
NCBI nr blastx   gi|170059075      2e-22   34.48%  conserved hypothetical protein [Culex quinquefasciatus]
Group
Gene Ontology    GO:0004252  2.3e-22  serine-type endopeptidase activity
                 GO:0016021  2.3e-22  integral to membrane
                 GO:0006508  3.8e-19  proteolysis
KEGG pathway
InterPro domain  [159-303]  IPR022764  2.3e-22  Peptidase S54, rhomboid domain
                 [160-272]  IPR002610  3.8e-19  Peptidase S54, rhomboid
Orthology group  MCL26573  Lepidoptera specific
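The bracketed InterPro ranges can be turned into string slices to pull the domain residues out of the protein sequence. This is a sketch under the assumption that the ranges are 1-based, inclusive residue positions on DPOGS204755-PA; the record does not state the convention, and the function names are hypothetical.

```python
def domain_slice(start: int, end: int) -> slice:
    """Convert a 1-based, inclusive [start-end] range into a 0-based Python slice."""
    if start < 1 or end < start:
        raise ValueError(f"invalid domain range: [{start}-{end}]")
    return slice(start - 1, end)


def domain_length(start: int, end: int) -> int:
    """Number of residues covered by a 1-based, inclusive range."""
    return end - start + 1


# Ranges copied from the InterPro rows above.
domains = {
    "IPR022764": (159, 303),
    "IPR002610": (160, 272),
}

for accession, (start, end) in domains.items():
    print(accession, domain_slice(start, end), domain_length(start, end))

# Toy illustration of the slicing convention on a short string.
toy = "ABCDEFGHIJ"
print(toy[domain_slice(3, 5)])  # residues 3 to 5 of the toy string
```

Applied to the full protein sequence, `protein[domain_slice(159, 303)]` would return the residues of the IPR022764 match, provided the coordinate assumption holds.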
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204755-TA
ATGACTCCCGACCTGGAGGAGTGTCCAGAGACGGAGAGACTGAACGGGGACTACCAGGAGGAGACATCCCATCAGGGCCGCAGAGTTCTACCGGACGAGCCGCCCGCTGTGCAGGCCTGGGTCGAGAACGAGGAGCCGGGGCTGGCCTGCGATAGTAGGACCAACAGTCCCGGAGAGAACAGGAGACTCCTGGGAAATAAGCCTCATCACAAATCAACCGCTAAACTCCAAGCTGGTCTCACGTTAGGAGTCTGGGGAGCTCCAAAAAAGCACGCGAAATACCCACGAAGTTTGAAAAATGTAAAGACAAAACATAACACGAAGTCGAAGTTCCAAAAACGAAAGGAAAAATTAATAGAAGCATTAAAGCCGCCCTACTTCATCATCACCATGCTGAGTATTATTGTGATGGTACATTTCTGTGGACCAGACAAGTGGCGTGCGACGCTCGAGTGGTCGCCGGGTGGCTGGTGGCGGGAACCCTGGAGACTTCTCACGTACGGCTTCGTCCACGCCGGCCCCGCGCATTTAGCACTTAACGCAATAGTTGCCCTAACGGTGGGGTGGCGTCTGGAGCGCGAGCAAGGTTGGTCTCGTGTGGCGCTGGTGTGGGCGGGCGGTGTGGCAGCGGGCGCCTTGGGAGCTGGTGCCTTACAACCTCACGTCAGGGTGGTGGGTTCTTCGGCCGCCGTCTACGCTCTTCTCACCGCGCACATACCCAATGTCTGTCTGAGGTTCGGTCATATCCCTCTGTGGTGGTTCCGTCCCCTGAGTGTGGTCGTATTGGGCGCCTCGGAGCTGTGCTGGGCGCTGCTGCAGGCTCCGGATCAGAAAGAAAGCAACCATGTAGCGTGGGGAGCACACGCGCTCGGGGCTGCGGTCGGCGTGCCGCTAGCGTTCATCGCCTTCACAGGTGAAAATTCGTCACAGACCAAAATCACAGTTTGTCGCGTAGCTTCCTGTATGTTGTTGGCGGCCGGGGTCCTCGCTGCTATTATGCATTACATGTTTTGGGCGGACTTTGATCACTTCACCTGA

Protein sequence:

>DPOGS204755-PA
MTPDLEECPETERLNGDYQEETSHQGRRVLPDEPPAVQAWVENEEPGLACDSRTNSPGENRRLLGNKPHHKSTAKLQAGLTLGVWGAPKKHAKYPRSLKNVKTKHNTKSKFQKRKEKLIEALKPPYFIITMLSIIVMVHFCGPDKWRATLEWSPGGWWREPWRLLTYGFVHAGPAHLALNAIVALTVGWRLEREQGWSRVALVWAGGVAAGALGAGALQPHVRVVGSSAAVYALLTAHIPNVCLRFGHIPLWWFRPLSVVVLGASELCWALLQAPDQKESNHVAWGAHALGAAVGVPLAFIAFTGENSSQTKITVCRVASCMLLAAGVLAAIMHYMFWADFDHFT-
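Both sequences above are given as FASTA records, and the protein record ends in "-". The sketch below reads FASTA text into a dictionary and strips a trailing terminator, assuming that the final "-" marks the stop codon rather than an alignment gap; that reading is an assumption, and the toy record and function names are hypothetical.

```python
def read_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {record_id: sequence}, joining wrapped sequence lines."""
    records: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def strip_stop_marker(protein: str, markers: str = "-*") -> str:
    """Remove trailing stop markers ('-' or '*') from a protein sequence."""
    return protein.rstrip(markers)


# Toy record using the first eight residues of the protein above plus a stop marker.
example = """>toy-PA
MTPDLEEC-
"""

seqs = read_fasta(example)
peptide = strip_stop_marker(seqs["toy-PA"])
print(peptide, len(peptide))
```

Running the same two functions on the records above and comparing the resulting lengths with the stated 1038 bp and 345 aa would be a quick integrity check on a copied sequence.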