Monarch geneset OGS2.0

DPOGS215883
Transcript        DPOGS215883-TA  3087 bp
Protein           DPOGS215883-PA  1028 aa
Genomic position  DPSCF300029 - 186126-316728
RNAseq coverage   60x (Rank: top 68%)
Annotation
Source          Hit               E-value  Identity  Description
Heliconius      HMEL020733        1e-97    86.60%
Bombyx          BGIBMGA000439-TA  4e-115   95.13%
Drosophila      CG33253-PA        3e-98    67.52%
EBI UniRef50    UniRef50_E3X2Y4   6e-103   72.56%    Putative uncharacterized protein n=2 Tax=Anopheles darlingi RepID=E3X2Y4_ANODA
NCBI RefSeq     XP_001661914.1    2e-105   72.08%    prohibitin, putative [Aedes aegypti]
NCBI nr blastp  gi|157130555      5e-104   72.08%    prohibitin, putative [Aedes aegypti]
NCBI nr blastx  gi|91085193       2e-104   76.77%    PREDICTED: similar to AGAP003352-PA [Tribolium castaneum]
Group
Gene Ontology    GO:0016020           4.5e-165  membrane
KEGG pathway     dpo:Dpse_GA14145     3e-22
                 K03364 (CDH1) maps to:
                     Ubiquitin mediated proteolysis
                     Cell cycle - yeast
                     Progesterone-mediated oocyte maturation
                     Cell cycle
InterPro domain  [364-625] IPR001972  4.5e-165  Stomatin
                 [397-556] IPR001107  3e-60     Band 7 protein
Orthology group  MCL16363 Insect specific
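
The two InterPro hits above are given as residue ranges on the protein. As a minimal sketch, assuming the ranges are 1-based and inclusive (the record does not state this), the following computes each domain's length and the region the two hits share. The helper names `span_length` and `overlap` are illustrative only and are not part of any database tool.

```python
def span_length(start, end):
    """Number of residues in an inclusive, 1-based range."""
    return end - start + 1


def overlap(a, b):
    """Return the inclusive range shared by a and b, or None if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# Ranges copied from the InterPro domain rows of this record.
stomatin = (364, 625)  # IPR001972
band7 = (397, 556)     # IPR001107

print(span_length(*stomatin))    # 262
print(span_length(*band7))       # 160
print(overlap(stomatin, band7))  # (397, 556)
```

Under that assumption, the Band 7 hit lies entirely inside the Stomatin hit.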

Nucleotide sequence:

>DPOGS215883-TA
ATGCTGTCAAAGGTTCTTACGTTAGTAGCACTACTGGGTGCAGCAAGTGCTCAACATTATTCTCACGGACAAGCCACTTCATCCCAGTCCATCATTCGTCATGATAACCAACACGGATACAACCACGTTCAACCTATCGCCGTCCATTCAGCTCCTATCTCATACCATCAGCCTGCTATAGCTGTTCACGCTGCCCCAGTCGCCAGTCACGTCAGTGTTCAACACGCTACTCCAGTCATTCAACATGTTGCTCCAGTCATTCAACACGTGGCTCCCATAAGACACGTAGCTCCCATTGCTCATGTAGCTCCCGTGGCTTACCATGCAGCACCAGCTTACGCTCAAGAACACCAAGAATATTATTCCATCATTCGTCATGATAACCAACACGGATACAACCACATTCAACCTATCGCCGTCCACTCAGCTCCAATCTCATACCATCAACCTGCTATAGCTGTTCACGCTGCCCCAGTCTCTAGTCACGTCAGTGTTCAACACGCTGCTCCAGTCATTCAACATGTTGCTCCAGTCATTCAACACGTCGCTCCCATAAGACACGTAGCTCCCATTGCTCATGTAGCTCCCGTGGCTTACCATGCAGCACCAGTTCATGCTCAAGGACACCAAGAATATTACTCCATCATTCGTCATGATAACCAACACGGATACAACCACATTCAACCTATCGCCGTCCACTCAGCTCCAATCTCATACCATCAACCTGCTATAGCTGTTCACGCTGCCCCAGTCTCTAGTCACGTCAGTGTTCAACACGCTGCTCCAGTCATTCAACATGTTGCTCCAGTCATTCAACACGTCGCTCCCATAAGACACGTAGCTCCCATTGCTCATGTAGCTCCCGTGGCTTACCATGCAGCACCAGTTCATGCTCAAGGACACCAAGAATATTACGTAAGCCAATTCAACGCTCAAGTAGAAAATTCAGCTCCTTCAACCCATGTCCAGCCTCAACCTCACCTGGCGCCTTTATTATCTAAACATCCTCGTATATTGATAGGAAACGTTTCCAAGTGTGAAGCCATCAAATTGGAACATCAGCGTCCAGTGTGCGGTTGTCCGTTCGTCGATATGGAAACAAACCCAGAAGCTGTGGGATGTGTGGAGAGATTTGCAACATTTCTGTCATTTCTTCTTGTCATTATCACTTTCCCGTTTTCGTTATTCGAATGTTTTAAGGTCGTCCAAGAGTTTGAACGCGCTGTGATTTTTCGTCTCGGTCGAGTTAGAAAAGGCGGTGCAAGAGGACCCGGTTTATTTTTTGTACTACCATGTATTGATACATACAGGAAGGTAGACTTAAGGACCGTGTCATTTGATGTACCGCCTCAAGAGGTATTAACCAGAGATTCAGTGACCGTTGCTGTGGATGCAGTAGTTTATTACAGGATAAAAGAACCTCTTAATGCTGTAGTTCGGGTAGCTGACTACAGTGCATCAACCCGTTTGCTCGCCGCCACTACATTAAGAAATGTGCTGGGTATGCGTGACCTGGCTCAGCTATTGTCTGACCGAGAAGCTATCAGTCATATGATGCAAGCCAATTTGGATGTAGCAACGGATCCTTGGGGAGTAGAAGTAGAGAGAGTGGAGATTAAGGACGTTCGCTTACCAGTACAGTTACAAAGAGCAATGGCAGCTGAAGCTGAAGCTGATCGCGAAGCTCGTGCTAAAATCATAGCCGCAGAGGGAGAGATCAAGGCGTCAATTGCCTTGAAGGAAGCCTCGTTAGTTATGATTGACAATCCTATGGCGCTACAATTGCGTTACCTCCAGTCATTAAACACGATATCAGCTGAGAAGAATTCAACTATAATATTTCCTTTTCCAATGGATTTCCTTAAAACTTTTATGCCGTGTCCTGACGAAGAAAAGGAGCCACTAGCTGAAAAGATTTTTCTCATAATAGCTATATTAATGGTTATATTATTTCCTCCTTCCTTGATTTGTTGTTTTAGGGTTGTAAATCAGTACAAAAG
AGCAGTTATTTTACGTTTTGGACGAGTTCGTCGCGATTCACCCGCCGGGCCTGGTATCATTTGGGTGGTTCCGTGCACTGATATAGTTTCACTCATCGACATCAGAACCCAATCTTTCAACTTACTGCCTCAAGAGGTACTAACAAAAGACTCTGTTACTGTCACAGTCGACGCCGTGGTATATTTTCATGTTATAAATCCATTGAACTGCTTGCTCAACGTCCATTCACACAAGCGTGCAACCGAATTGCTCGCTATAGCAATTTTAAGAAATATTTTGGGACAATATACACTGACAGATCTGCTTACAAATCGTGTAGCGATCAGTCAAGCGGTTAGTGAAGAAATTGATAAAGGAACAGCTGAGTGGGGCGTTCAAGTGGAACGCGTGGAAATAAAAAATGTGGTTCTGCCGTACGAACTGCAAAAGGCAATGGCAGCAGAGGCTGAAGGAACTCGAATAGCAAAAGCTAAGATTATAGAAGCAGAAGGTGAAATCAAAGCTGCAGAAAACCTTAGAGACGCGGCCAAAATTATGATGGAGAAACCAAAAACTATACTAGCATTAACTAAAGACTCACTGACTGTTTCTGTTGACGCTGTTGTTTTTTACAAAATAGTTGATCCCGTGCTTGCCGTAATTGGGGTAACCGACTACAAAGTATCCACACACTTTCTAGCTGCCACAACATTACGTAATGCCCTTGGAACGAGAAAACTCGCGGAATTGTTGGCAAGCCGTCCTGATGTCAGTCAGCAAGTATTCAATCTAATGAAGAATATTACGGTAGCCTGGGGAATCAAAATCGTTAGAGTAGAGATAAAAGACATAAGTCTACCGCTGCAATTGCAAAAAGCGATGGCGGCTGAAGCTGAGTCGACGAGATTGGCAAATGCTAAAATAATTGTTGCAAAATCAGAAATCGAAGCAACCAAAAGCCTTCAACTAGCTACAGACATTTTAATGGATAATCCAATGTGCATGCAACTCAGATATTTGCAATCGCTCAACATGATCGCCGGTGAGAAAACACATACAATTGTGTTCCCGTTTTCTGTTGACGTTATTAATAAAATAACAAGTTAA

Protein sequence:

>DPOGS215883-PA
MLSKVLTLVALLGAASAQHYSHGQATSSQSIIRHDNQHGYNHVQPIAVHSAPISYHQPAIAVHAAPVASHVSVQHATPVIQHVAPVIQHVAPIRHVAPIAHVAPVAYHAAPAYAQEHQEYYSIIRHDNQHGYNHIQPIAVHSAPISYHQPAIAVHAAPVSSHVSVQHAAPVIQHVAPVIQHVAPIRHVAPIAHVAPVAYHAAPVHAQGHQEYYSIIRHDNQHGYNHIQPIAVHSAPISYHQPAIAVHAAPVSSHVSVQHAAPVIQHVAPVIQHVAPIRHVAPIAHVAPVAYHAAPVHAQGHQEYYVSQFNAQVENSAPSTHVQPQPHLAPLLSKHPRILIGNVSKCEAIKLEHQRPVCGCPFVDMETNPEAVGCVERFATFLSFLLVIITFPFSLFECFKVVQEFERAVIFRLGRVRKGGARGPGLFFVLPCIDTYRKVDLRTVSFDVPPQEVLTRDSVTVAVDAVVYYRIKEPLNAVVRVADYSASTRLLAATTLRNVLGMRDLAQLLSDREAISHMMQANLDVATDPWGVEVERVEIKDVRLPVQLQRAMAAEAEADREARAKIIAAEGEIKASIALKEASLVMIDNPMALQLRYLQSLNTISAEKNSTIIFPFPMDFLKTFMPCPDEEKEPLAEKIFLIIAILMVILFPPSLICCFRVVNQYKRAVILRFGRVRRDSPAGPGIIWVVPCTDIVSLIDIRTQSFNLLPQEVLTKDSVTVTVDAVVYFHVINPLNCLLNVHSHKRATELLAIAILRNILGQYTLTDLLTNRVAISQAVSEEIDKGTAEWGVQVERVEIKNVVLPYELQKAMAAEAEGTRIAKAKIIEAEGEIKAAENLRDAAKIMMEKPKTILALTKDSLTVSVDAVVFYKIVDPVLAVIGVTDYKVSTHFLAATTLRNALGTRKLAELLASRPDVSQQVFNLMKNITVAWGIKIVRVEIKDISLPLQLQKAMAAEAESTRLANAKIIVAKSEIEATKSLQLATDILMDNPMCMQLRYLQSLNMIAGEKTHTIVFPFSVDVINKITS-