Monarch geneset OGS2.0

DPOGS210690
Transcript         DPOGS210690-TA   4896 bp
Protein            DPOGS210690-PA   1631 aa
Genomic position   DPSCF300013 - 709063-735944
RNAseq coverage    38x (Rank: top 73%)
Annotation
Heliconius       HMEL020875         0.0   85.88%
Bombyx           BGIBMGA006306-TA   0.0   71.48%
Drosophila       -                  -     -
EBI UniRef50     UniRef50_D6W8H2    0.0   49.09%   Putative uncharacterized protein n=3 Tax=Endopterygota RepID=D6W8H2_TRICA
NCBI RefSeq      XP_001814363.1     0.0   49.09%   PREDICTED: similar to C3 and PZP-like alpha-2-macroglobulin domain-containing protein 8 [Tribolium castaneum]
NCBI nr blastp   gi|270002789       0.0   49.09%   hypothetical protein TcasGA2_TC000808 [Tribolium castaneum]
NCBI nr blastx   gi|270002789       0.0   47.05%   hypothetical protein TcasGA2_TC000808 [Tribolium castaneum]
Group
Gene Ontology     GO:0004866   2.5e-10   endopeptidase inhibitor activity
                  GO:0005576   3.4e-09   extracellular region
                  GO:0005615   9.8e-08   extracellular space
KEGG pathway      -
InterPro domain   [660-941]     IPR008930   1.8e-18   Terpenoid cyclases/protein prenyltransferase alpha-alpha toroid
                  [260-346]     IPR001599   2.5e-10   Alpha-2-macroglobulin
                  [65-189]      IPR011625   2.7e-10   Alpha-2-macroglobulin, N-terminal 2
                  [1042-1175]   IPR009048   3.4e-09   Alpha-macroglobulin, receptor-binding
                  [788-907]     IPR011626   9.8e-08   A-macroglobulin complement component
Orthology group   MCL18584   Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210690-TA
ATGTATGAAGACAATGAGTTAGTCCGAAATAAAAATTTACGAACAGACTCTTATGCTCAGATATTTGGCTCTGCTGAGTTAGTAGCTGTGGAAGGCAAAGAGATAGAAACACACTATGTGTTGGCCCGGGAGAAAATACGTCGGTGGAATTCTACAACAAAATGTTACTTGCTCGTGGAAAACTTGCCTACGCCATTACAGGCTGGTGGAATTGCAAGCGCTAGTGTTTGGTCATCATGTGGGTGCCGTCAGCGTCTGTTAGCGGCGGTCACCAACGGTGGCCGTGCACTGCACTGGGCAGCTGTACCAGCACCGAAATCTGCCGATGAAAACGATTTGTGCCGTTTTAATTACACATTTCCAGTGACGGCTGACATGGCACCGATCAGTTCTCTGCTAGTTTATTACGTCACCGAGCTAGGCGAGCCAGTGAGTGACGTAGCCAGCTTCCACGTCAAACTACTACATAAGGAAGTGGCCGTAGCTATAGAAGACCGTCGATGGTGGTACCCACGAACCGCACTACAGCTTCGAGTGCTGGCACCGCCGGATTCGTTGATGTGTTTGATCGGAGCGCGTGCACTCACTGACTCAAGATTTGATACACATCAAGGTGAACCAGAACACGAAGAACAACCCGGGCCTGAGTTCGTGTCAGCGGGAGTATCACTGTTCGTTGGTGGTGGCACTTGCGGTGGTGGGGTGCTCTACCGACAGAAGACAGTCACACCTCGTGCACCAGCCCATTTAGTACCGCCCGCGTCCCACGACAGGCTTTGGATGTGGAAATGCTTTAATTATACAACTCAATTGTCAACTGATGGAGTAACAATAGCGGCGCCTTCAGAAGCTGGGCGCTGGTCTCTTTGGGCACTTTCTTTGTCTAATCGTGGTCTTCGATTCTCGGCTCCAAAAACCATTAACGTCTTTCGTCCGATACAACTGGACTTCTCCCTGCCACCTGCCCTGAAAGTCGGCGAAACAGTAGAAGTTGACGTCAAAATCACCAATAACATCAACAATTGTATGGACGTGACAGCCCTATTAGCGCTAAGTGCTGGGGCAGCTTTCGCGAGTACTGGTGCTTTATATGTCACTGAACGATTGAGACTCGGCCCGCGTGGTGGGACTCAGTTAGTTGTGAGAGTCGCTGTTAATACTCCCGGAAGGAAGAATATTACTGTGGAAGTAACTGGATATAGCGCTGATAACTGTACGGTGTCTTACACATCTTTCAACAACGAGACCCTTGTCGGTTCTGTAATTCGATCAGCGAGTGTATTGGTCCTACCGGAGGGACTACACCGGAGCGATACTCAGAGCGCATACTTCTGTGCTAACGAACATCTCGCGGTTTCTTCCCGTGGTTCATGGGAGTGGCAATGGGTGGCAGCGCCTCGTAACAGGGCAGGTCTTGTATTAGAATTGAAGGCACAGGGCGCAGCACATGTTGCATTATCGGCCGTGAGGGAACCATCCGATGATATGTATAGAGTTGTGATCGAGAGGAGTCGAGTATGGATTGCGAAAGGAAAACATGGTTATGACGTACACCTTGCCAGTGCGGAACAAACTGAAAGCGACGCGGACTGCTCTGGTGAGGACTCTTGGTGCGCTTGGTGGGTGTGGTGGGAGGGCGGTCGTCTTTCCGTCGGTAGAGGAGCATCTCCTTCAGAAAGAAGGTTGTTAGTATGGCCCCTTACAGCAGATATGAGGATAAAGTATGTCGGTTTCAGTGCGCTTTGGGGAGATCAAGCTGATTTTAGAATATGGAACTTCAATGAAGAAGCTGGATTTTCCCAAGTATTAGAATTAGGTCTACCCCATGGAGTGGTACCTGGTTCAGCGAGTGGGACGTTATTAATTTCCGGAGGTCTTCATCTTCCTTTATATAGTTTCCAAACGGATGCTTCAGATATATGGTCAGATGTTTGGAAAGATTCTCAATTATCAGCAGCTTCAGCTAGTTTGGCACCGTTATTAGCATTGGAACATATACC
TCATTTAGTGGACGAAATGGAGAAGGAAAGAATATTGAATAAGCTACCTGAACAGGTACAAATACTACTTTCATTTCGTAAAAGCGATAACTCGTTCAGCGATCATCCAGCAGTAAGCAGTCATTTATCTACAATCAAAATCTTAGAAATTTTAAACAAAATTCAATCATATTACCCAGTGGATCCGGAACTTCTACAATCCATAAAATCTTGGATACAATCTAGGCAAAATCCAGATGGTTCCTTTACCCCACTTGCTGCAGACAAGGAAGTCGATTATTATCCTGTTGAAATAAAAAATGTAAACGGCACAGACGCTGAGTTTGATGTAAATGAATACTACTATTATGACAAAGATGGTAATATGACGCAAGAAGTAATTGAATATGAGAGAACCGTAGAAGTTACAGCAGAAACTTTGGTATCATTACTAGAAGTTGGAGTAGAAAATCAAGTAGATGCAGATGTTGCAAAACTAGCGCAAACGTACTTAGAGAATAATGTCCGGAATCTGACCTCGCCAGCCACTTTAGCAGCCACTGTTTTAGCGCTTGTTTTGGCAAGAAGTCCTATCGTACCTGAAGCGTTACTTATACTACGTAATGCATCAACTACTGAAGAAGGAGAGTTCGGTTGGCCAGCTCCCAGAAAGGATGCAGCAGATTGGCTCCTTGAAGAAACCTCTAGAAACATCAAAACCACTTCCTACGAAGCGGTTACAATGGAGCAGTATGTGGCTGGCGTGCGTGTGTTACTAGCTGCGTGTGCGCGGGGAGCCTTGGCGGAAGGAGAAGCGGCGGCTCGGTTCTTATACTATCGGGCATCAACTTTACAAAGGCATCCCAGTCTAGCATACCAAGCTACAAAAGCAGCTGCGCAGTACGCTGCACTGGCTCATGATAGACATAGAGCACTGACAGTATCTCTGGCTACAGCTGGAATGGAATTAACAGACACGTTAGAACTACGCGCGTTGACACCACCTCGGCCACTACAACTTCCAGGTCTACCTACTAAGGTGTTCGTATACGCCACCGGCGCTGGATGTGCCACTGTACAGGGCACAATATCATATTCGACGTATAATCCTAAAGCAGAAAATGCGCTGCTGAACATCCAAGCAGCTATTATTGAAGAGATAAGACCTGAACGAAGCAGCATCGAAGATTTGCAAGGAAACTTGCCGACATTGATCATTAAATCTTGCTTCAAATGGAAAGGAAAAGAGCGCTCCGGAATTCTTCGTTTAGAATCTTCTCTTTTCTCTGGCTATGAATTACATTCAGTAAATCCTGTTGTTCTTGATGGGGCCACGTTTGCTGACTTACATTACGGTTCGCGTGGAGAATCAGTGTGGTTTGTGTTTACTAATATTAGCTCCACTTGTCCGGTTTGCGTAACTTACGAAGCGAGATCAAAGTTCGTCATAACAAGCCTCCGTCCAGCATTTGCTAAAATTTATCCTTCAAGCAGACCAGATTTAGCTGTTGAAACATTCTTCCACGCAAGACCCGGAAGTCCTCTGTTAAGGGGTATCACAGATGATGATTTTATAACTTGGTTCGATAAAACCCAACGTGCTAGTCTAAAAACAAACACAAATATTGACAATATTTGTGAATGTGGTCGTATATGTAGTAGAGATTATGAATTTAGAAAGGATTACAAGAAAATGATGGAATCAACAACAACAGAGGAGACGACAACAGTAAAAATTACAGAACCAACTTTAACAACAGACTATAAGATATCAACAACAGATGTAGTAACAGACATTCAAAGTGACTTACCTTCTACTTTATCCTCAACTTCAATTACCATAGAAACCCAAGATATATCAACAACAAGTATGCCAGCCCCTGGAAATGATACATCGAAAGTTTCTAATGCCACAATATCTATAAATACTGACGATCCAATAATCATTCCCACTATAACATACGCAACGCAAACTGAAAACCAAAATATCAGTAATAACATTCTGCCAGCTGTCCCTATAA
TAAACGGTGAACTCATCGTACAAAAGCTTCCTGTTAATAAAAATTATGCAAAAAAACCTGACTTCAGTAAGAAACCATTGCCGCGACGTAAAGGTACATTAAAAGCAACCTACGGGGATAAACATGAAAAGTTCTTTTCAAAATCTAAAATCCCTGATGATTTGAATCTTATAAAAACTATAAAACCGGTTTATCAAGTTACCGAAACTTCTACAATAAAAGGTTTGACTACGACTACGTCAACACTAAAAGGTATAACTAGTTCTGAAGCAAAAACTCCTGAACACGATATTTCATCCACTATGAAGACGGAATTGAGGACTGTCACAGTATTTAATACAAATTCTACTATTATAACTCCAAGTAGTGTAACTGAAGACAAAATTAAGTCTAATAAAACTCTTATTTTCACTCAACCCGAAATAACTACGGTCCCCCACTCAATAACAGCAATCACAAAAAGCAATATTAAAACAATACACTATAGGACTAAGAAACCTAAGCCTAAGACACAAATTAAGAAGCCCAACATAAATACGAATAACACTAACGAGAAGCCTCTGAAAAATAATAAAACCACGAAACCTGAGATTGTTCTTAACACGACAAAAATAAGATTGGATTCGACCGAAAAATATGTATCAAAATCCTTAAAATCTATCAATAAGGAAATTCATAAAATACCGTTCACACCTGTTTCAGAAACCACTAAATCAAACAATATCCCTACGAAATCAGATATCGCACCTGAAAATAGAGAAGGGTACGAAATTTTAGACAAAAATAATCTTTGGGAGCTTCTTAAAGAAGGTCCGGATGATACTAAAATAGAAGATAAAATTAATGTTCACAATCGATTGAATGAAGTGTCATCTGTCAATAATCGTTCTTTATAA

Protein sequence:

>DPOGS210690-PA
MYEDNELVRNKNLRTDSYAQIFGSAELVAVEGKEIETHYVLAREKIRRWNSTTKCYLLVENLPTPLQAGGIASASVWSSCGCRQRLLAAVTNGGRALHWAAVPAPKSADENDLCRFNYTFPVTADMAPISSLLVYYVTELGEPVSDVASFHVKLLHKEVAVAIEDRRWWYPRTALQLRVLAPPDSLMCLIGARALTDSRFDTHQGEPEHEEQPGPEFVSAGVSLFVGGGTCGGGVLYRQKTVTPRAPAHLVPPASHDRLWMWKCFNYTTQLSTDGVTIAAPSEAGRWSLWALSLSNRGLRFSAPKTINVFRPIQLDFSLPPALKVGETVEVDVKITNNINNCMDVTALLALSAGAAFASTGALYVTERLRLGPRGGTQLVVRVAVNTPGRKNITVEVTGYSADNCTVSYTSFNNETLVGSVIRSASVLVLPEGLHRSDTQSAYFCANEHLAVSSRGSWEWQWVAAPRNRAGLVLELKAQGAAHVALSAVREPSDDMYRVVIERSRVWIAKGKHGYDVHLASAEQTESDADCSGEDSWCAWWVWWEGGRLSVGRGASPSERRLLVWPLTADMRIKYVGFSALWGDQADFRIWNFNEEAGFSQVLELGLPHGVVPGSASGTLLISGGLHLPLYSFQTDASDIWSDVWKDSQLSAASASLAPLLALEHIPHLVDEMEKERILNKLPEQVQILLSFRKSDNSFSDHPAVSSHLSTIKILEILNKIQSYYPVDPELLQSIKSWIQSRQNPDGSFTPLAADKEVDYYPVEIKNVNGTDAEFDVNEYYYYDKDGNMTQEVIEYERTVEVTAETLVSLLEVGVENQVDADVAKLAQTYLENNVRNLTSPATLAATVLALVLARSPIVPEALLILRNASTTEEGEFGWPAPRKDAADWLLEETSRNIKTTSYEAVTMEQYVAGVRVLLAACARGALAEGEAAARFLYYRASTLQRHPSLAYQATKAAAQYAALAHDRHRALTVSLATAGMELTDTLELRALTPPRPLQLPGLPTKVFVYATGAGCATVQGTISYSTYNPKAENALLNIQAAIIEEIRPERSSIEDLQGNLPTLIIKSCFKWKGKERSGILRLESSLFSGYELHSVNPVVLDGATFADLHYGSRGESVWFVFTNISSTCPVCVTYEARSKFVITSLRPAFAKIYPSSRPDLAVETFFHARPGSPLLRGITDDDFITWFDKTQRASLKTNTNIDNICECGRICSRDYEFRKDYKKMMESTTTEETTTVKITEPTLTTDYKISTTDVVTDIQSDLPSTLSSTSITIETQDISTTSMPAPGNDTSKVSNATISINTDDPIIIPTITYATQTENQNISNNILPAVPIINGELIVQKLPVNKNYAKKPDFSKKPLPRRKGTLKATYGDKHEKFFSKSKIPDDLNLIKTIKPVYQVTETSTIKGLTTTTSTLKGITSSEAKTPEHDISSTMKTELRTVTVFNTNSTIITPSSVTEDKIKSNKTLIFTQPEITTVPHSITAITKSNIKTIHYRTKKPKPKTQIKKPNINTNNTNEKPLKNNKTTKPEIVLNTTKIRLDSTEKYVSKSLKSINKEIHKIPFTPVSETTKSNNIPTKSDIAPENREGYEILDKNNLWELLKEGPDDTKIEDKINVHNRLNEVSSVNNRSL-