Monarch geneset OGS2.0

DPOGS203504
Transcript: DPOGS203504-TA (3147 bp)
Protein: DPOGS203504-PA (1048 aa)
Genomic position: DPSCF300055 - 614336-620230
RNAseq coverage: 170x (Rank: top 51%)
Annotation
Heliconius      HMEL005491        4e-10  30.58%
Bombyx          BGIBMGA008561-TA  0.0    67.47%
Drosophila      AP-47-PA          4e-10  31.93%
EBI UniRef50    UniRef50_Q7QF90   0.0    70.07%  AGAP000363-PA n=5 Tax=Pancrustacea RepID=Q7QF90_ANOGA
NCBI RefSeq     XP_968279.2       0.0    56.02%  PREDICTED: similar to clathrin coat assembly protein, partial [Tribolium castaneum]
NCBI nr blastp  gi|270005660      0.0    56.11%  hypothetical protein TcasGA2_TC007752 [Tribolium castaneum]
NCBI nr blastx  gi|270005660      0.0    56.52%  hypothetical protein TcasGA2_TC007752 [Tribolium castaneum]
Group
Gene Ontology    GO:0006886  7.7e-50  intracellular protein transport
                 GO:0005515  7.7e-50  protein binding
                 GO:0030131  7.7e-50  clathrin adaptor complex
                 GO:0016192  7.7e-50  vesicle-mediated transport
KEGG pathway     gga:421307  5e-50
                 K03122 (TFIIA1) maps-> Basal transcription factors
InterPro domain  [1-1037]    IPR017110  5e-252   Stonin
                 [678-1000]  IPR008968  7.7e-50  Clathrin adaptor, mu subunit, C-terminal
                 [680-708]   IPR001392  5.5e-06  Clathrin adaptor, mu subunit
Orthology group  MCL16371  Insect specific

Nucleotide sequence:

>DPOGS203504-TA
ATGGAATTTACAAGTTCACACTTACTGGATCGTATACCCCCAACAAGAACACCAAGTCCTGTATCAGTGAGAGATATTCACTCCCCAAGTCCGACCCCTGAGCCATATTTGTTTGAAGATAACACAAGCACAACGCAAACGGTCAATAAACCTACACGTCCTCCACCTGTACGGCCTCCTCGTCCTGCACCGCCTCCCAGACCGCAACCACCGGTTCAAGCACCAGTTCAAACAGCTCCAGATGACATCAATCTTTTTGATGCTCCTGTACCTACCACCATAAAACCAACAAAAGAAGCTATATTAAGTTTATATTCAGCACCTAAAAAGGAAGAAAAACAAATTGATTTTTTAAGTGATGATATCCTCGAATTATCAAACGAAACGGACATGCAACCTACGGAAGGTTTTGGACAACCTGCCAAAAATTCTCAAACTTCAGAAGCGACTCACTTACTTTTCAATGATTCAGTTTCATCTATCTCAGATAATATAAATCATTCGGAAGTTCAAAAGTCTTTATTCGAATCGGACAACATAAATCCTAACGAAGACACCAATGTTGCAATGGATTGTACAGACCTATCGGAAAAGAATGTAAACGTCAGTGCTACAAATACATCCCCATTTGCTGACATGTCGGACACAAATTTTCAAACACAACAAGGTGAAAATGAGAAGAATCCTTTTGAGTTATCGGGAGTTATTGACAATAATTCTCCATTTGCTACTTCTGATCTTATGAATTCTGAAATACTGACTGAGCCTGCAGATATTGTGATGACTTCTAGTCATAAACAACTTATAAATGACGAGTTACACATTGTAACTACAACTGAGCCCAAAAAGGCACCAGAACAACAGATGAATGAGGTAAATGTCTTTGCCGCTATGGCAACAGACGATTTTGTTGAAAAATCCAATATATTTGCCTCAGTACCTCAAGAGACAACATCTCACAATGATGAAAATATCTCATCCAGCAACACAAACTGGGGCACAACTGAGAAGATGGAAACCGCTTTTCCTGAAACTCAAGACGACTTTGACGCATTTTCAGCTAAATTTGAATGCGCTGGAGGAAATCAAATGAACACAGACGCATGGGGCGATGACAGTAGTGCTGACCTTGCTCCAGGCGGTTTCGAAGCCGAAGCATTTGATTCGTTTTTAAGTATGGACGCTCCACCGGCTCCAGCGGCTACCCCAAGATTAGATCACCAGGAATCTAAAGATTCTATGGACGATACATTTTCGGTATTTATAAGACCCAAAGAAGGTGAATTCAGTACAAACGAAGCTTCGGTGCCCGTTCTGGCTCCACCGCCAAGACCCCAGGTACTCCCAGAATCACCACGAGCAAATCCCTTCGATAAGAATGAAACTATAGGAATGCAACAAAGAATCGATTTTGATGAAGCAGCGCAGAAACCTGATGGTAGCGGTGCGGAGACACCTCCGACACCGCTCTTTGACGAGGACGTTTCGGTTCCGCTAGAGGACTTCCCTCGTACCAAGTATGCGGGACCGGGTTGGGAGATGCATCTTCGGCAACCGAACAAGAAAAAGATCACAGGACAAAGATTCTGGAAAAAAGTCTTCGTCCGCCTTGCAAACCCGGGTGACAGTCCCGTGGTGCAACTGTTGAACTCAGCCACAGACAAGCAACCGCTGCAAGAACTGCCACTGCAGGCGTGCTACTCGGTGTCAGAAGTAGGCGCACAGCAATTTGATCAGTTCGGAAAAATCTTCACTGTCAAGCTACAGTATGTCTTCTACAAGGAGCGTCCGGGAGTGCGACCAGGCCAAGTAACGAAAGCGGAGCGTATCACAAATAAGCTGTCTCAGTTCGCTGCGTACGCCATCCAGGGAGACTACCAAGGTGTCAAAGAGTTCGGCAGCGATCTCCGCAAGCTGGGGCTGCCTGTAGAGCATGCGCCGCAGGTTACACAACTGTTCAAGTTGGGATCACTGGACTACGAGGCCGTGAAGCAATTTTC
GTGCTGCGTGGAGGAGGCCTTATTCCGCTTGTCTGCCCATCGAGACCGCGCCCTTACCTACAAGATGGAGGAGGTACAGCTGACGGCGGTGGACGAACTTTACGTGGAGCAAGACGCGGAGGGTGCCGTGCTAAAGCAAATAGCTCGAGTGCGACTCTTCTTCCTGGGTTTCCTTAGCGGTATGCCAGACATCGAGCTGGGGGTCAATGACTTGAGACGACAAGGCAAAGAGGTTGTAGGCCGCCATGACATCATCCCCGTAGCAACCGAGGAGTGGATCCGTGTGGAGGACGTGGAATTCCATGCGTGCGTGCAGCCGCAGCAATTCCAAGATACGCAGATCATCAAGTTCAAGCCGCCGGACGCGTGCTACATCGAGCTGATGAGGTTTCGGGTGCGACCTCCTAAAAACCGCGAGCTGCCGCTTCAGCTCAAGGCCGTCTGGTGCGTCACCGGAAACAAGGTGCGTGCTCACCGGGTGGAACTGCGCGCGGACGTACTGGTGCCAGGGTTCGCGTCGCGCGCTCTCGGTCAAGTTCCATGCGAGGACGTCGCTGCACGCTTTCCCATTCCCGAGTGCTGGATCTATCTGTTCCGAGTAGAAAAACATTTCCGATACGGTTCTGTGAAGTCCGCCCACAGACGTACTGGTAAGATCAAAGGCATTGAGCGTTTCCTCGGCGCTGTGGACACACCTCAGGAATCCCTCATCGAGGTGACTTCGGGGCAGGCCAAGTATGAACACCAGCACCGTGCCATCGTGTGGCGTATGCCTCGCTTGCCCAAAGAAGGACAAGGCGCATACACGACCCATAATCTTGTGTGTCGTATGGCTTTAACATCGTACGATCAAATTCCGGCCGAACTGGCACGTTGGGCTTTCGTCGAGTTCACGATGCCCGCGACTCAGGTCAGCCACATGGTGGTACGATCCGTGTCGCTCCAGGACCACGACGGGGACCCTCCCGAGAAGTACGTGCGTTACCTCGCGCGACACGAGTACAAGGTGGCCATCGAGCGTACGACGGGTGCCCCGACGGCCGCATATGCGCTGGCGGCCGCCAAGGATACCGCCCCCGCCTCCCCGGCCGCAGCCGCAGACCCCGCTCCGCCACAGGAACCCTCCTCGGACTCCGACTCCGACTAG

Protein sequence:

>DPOGS203504-PA
MEFTSSHLLDRIPPTRTPSPVSVRDIHSPSPTPEPYLFEDNTSTTQTVNKPTRPPPVRPPRPAPPPRPQPPVQAPVQTAPDDINLFDAPVPTTIKPTKEAILSLYSAPKKEEKQIDFLSDDILELSNETDMQPTEGFGQPAKNSQTSEATHLLFNDSVSSISDNINHSEVQKSLFESDNINPNEDTNVAMDCTDLSEKNVNVSATNTSPFADMSDTNFQTQQGENEKNPFELSGVIDNNSPFATSDLMNSEILTEPADIVMTSSHKQLINDELHIVTTTEPKKAPEQQMNEVNVFAAMATDDFVEKSNIFASVPQETTSHNDENISSSNTNWGTTEKMETAFPETQDDFDAFSAKFECAGGNQMNTDAWGDDSSADLAPGGFEAEAFDSFLSMDAPPAPAATPRLDHQESKDSMDDTFSVFIRPKEGEFSTNEASVPVLAPPPRPQVLPESPRANPFDKNETIGMQQRIDFDEAAQKPDGSGAETPPTPLFDEDVSVPLEDFPRTKYAGPGWEMHLRQPNKKKITGQRFWKKVFVRLANPGDSPVVQLLNSATDKQPLQELPLQACYSVSEVGAQQFDQFGKIFTVKLQYVFYKERPGVRPGQVTKAERITNKLSQFAAYAIQGDYQGVKEFGSDLRKLGLPVEHAPQVTQLFKLGSLDYEAVKQFSCCVEEALFRLSAHRDRALTYKMEEVQLTAVDELYVEQDAEGAVLKQIARVRLFFLGFLSGMPDIELGVNDLRRQGKEVVGRHDIIPVATEEWIRVEDVEFHACVQPQQFQDTQIIKFKPPDACYIELMRFRVRPPKNRELPLQLKAVWCVTGNKVRAHRVELRADVLVPGFASRALGQVPCEDVAARFPIPECWIYLFRVEKHFRYGSVKSAHRRTGKIKGIERFLGAVDTPQESLIEVTSGQAKYEHQHRAIVWRMPRLPKEGQGAYTTHNLVCRMALTSYDQIPAELARWAFVEFTMPATQVSHMVVRSVSLQDHDGDPPEKYVRYLARHEYKVAIERTTGAPTAAYALAAAKDTAPASPAAAADPAPPQEPSSDSDSD-
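
The reported lengths can be cross-checked against each other. The following is a minimal sketch, not part of the database record. It assumes the transcript is entirely coding sequence (the nucleotide sequence above starts with ATG and ends with TAG) and that the trailing "-" in the protein sequence marks the stop codon. The function names read_fasta and cds_matches_protein are hypothetical helpers introduced here for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record keyed by the text after ">".
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record.
            records[name] += line
    return records


def cds_matches_protein(cds_len, protein_len):
    """True if a CDS of cds_len bp encodes protein_len residues plus one stop codon.

    Assumption: the whole transcript is coding sequence, with no UTRs.
    """
    return cds_len % 3 == 0 and cds_len // 3 == protein_len + 1


# Lengths reported in this record: 3147 bp transcript, 1048 aa protein.
print(cds_matches_protein(3147, 1048))
```

When working from the sequences themselves, pass the two FASTA blocks to read_fasta and strip the trailing "-" from the protein before counting residues, since it stands for the stop codon rather than an amino acid.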