Monarch geneset OGS2.0

DPOGS216107
Transcript: DPOGS216107-TA (3177 bp)
Protein: DPOGS216107-PA (1058 aa)
Genomic position: DPSCF300182 - 154708-165112
RNAseq coverage: 267x (Rank: top 40%)
Annotation
Heliconius       HMEL015447         0.0   82.03%
Bombyx           BGIBMGA009225-TA   0.0   77.18%
Drosophila       Sbf-PA             0.0   42.97%
EBI UniRef50     UniRef50_D2A1W9    0.0   51.77%   Putative uncharacterized protein GLEAN_07785 n=2 Tax=Tribolium castaneum RepID=D2A1W9_TRICA
NCBI RefSeq      XP_002422691.1     0.0   50.45%   SET-binding factor, putative [Pediculus humanus corporis]
NCBI nr blastp   gi|270005687       0.0   51.77%   hypothetical protein TcasGA2_TC007785 [Tribolium castaneum]
NCBI nr blastx   gi|328780506       0.0   53.70%   PREDICTED: myotubularin-related protein 13 isoform 1 [Apis mellifera]
Group
Gene Ontology    GO:0016311   2.2e-15   dephosphorylation
                 GO:0016791   2.2e-15   phosphatase activity
KEGG pathway
InterPro domain  [1-150]     IPR022096   4.8e-51   Myotubularin protein
                 [517-568]   IPR010569   2.2e-15   Myotubularin-related
                 [292-336]   IPR004182   2.5e-06   GRAM
Orthology group  MCL10595   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS216107-TA
ATGAATAGCGCTCTCCAATCTGCGAGTGCTTTGGACGAGCACTCCATAGCGGCGGCCATTTTGCCGCTGGCTACAGCATACTGCCGCAAGCTGTGCACTGGCGTTATACAATACGCGTATATGTGTATACAGGCTACATCTGCTATGTTTACACAGGAACACCAAGTGTGGACCAGCCAGCAGTTCTGGGAGGCTGCGTTTTACCAAGACGTCCAAAGGGACATCAAAGCTCTGTATCTACCAAGTCCTACACATCATAATAGGGTCTCCAGCCCAAGAGAGGATGACGAATACATATCTTTACTAAAAGCTCAAGAACCTAGCGCATTAGAAATAGCAGCTGAACAAATGAGAATATGGCCGACACTATCACCTGAAAAACAACGCGAGCTATTAGCCAGTGAGGAGTCGACGCTGTACAGTCAGGCCATTCACTATGCTAATCGTATGGTGTACCTGCTGCTGCCGCTGGAGGGCGCCGGGAGAAGGGACGTCCGGGATGACGACAGGAACAGCAATAGCATCACCAACAGTGTAGCTGAATCAGATAGTGCAGATGCGGAGTCGGGATTCGAGGAGGCTGATCCCGGAGAGGCGGGGAACAATGTTATTAAGATGGTATCGAGATTCGTTGACAAGGTGTGTACGGAAGGCGGCGTGACGGCGGAGCACGTGCGCTGTCTTCATCAGATGATACCCGGCGTCGTTCACATGCACCTGGAGACGTTGGACGGAGTGGCCAGGGAGAGTCGGCGGCTTCCACCCGTACAGAAGCCTCGGATAGCGTGGCCGACCTTAGTGCACGGCGAGGCCAGTGCGGGTGGCGCCCTACGCGCGGTGCTGCTCGCTGATGGCCGGAGTGCTGCACTGCCACAGCTGCTGCCAGCCGAGGGAGCGCTGTTCCTCACCAACTACAGGCTGCTGTTCAAAGGAGTTCCTGTCGACCCTTACGCATGCGAGGCGACGGTGGTCCGCTCGTTCCCCCTGAGCGCCCTCACCCGCGAGAAGGGTGTCCGCGCCGCGCCCGCCCACCTGGAGCACGCGCCTCATGACGCCTTGCAGCTCAGGGCTGCCACCTTCCAGCTTATTAAGGTGGCGCTGGACGAGGAGGTGAGCAGCGAGCAGGCGGAGTCGTTCCGTAAGGCGGTGTCTCGTCTGCGGCACCCTCCCCACCCCCTGCTGCACTTCGCCCTCGCGCCTCGAGCTGCACCCCCCAGCGACCTCGCTCAGCCTAAACACCACACACTCAAAGGATTCGCCAAAAAGACCCTCCTGAAGACGGCTCGTCGTGCCGGGTTGAAGCCCAAGCCATCCAAGCGACAGAAGTACGTGCTGCCGGCGGACGCTCGGTCATTGACTTCGCCACCCCTCATGGCTTCGCTCTCAGCTGATACATTGTCCTTGGAGGAGTTAGAGTTGGGTGGTGTGAGTCCTGAGGCGGACGCGTGCCGTTCCCTCGAGCGTGTCCGGGAGAGAATGTACGCTAGGGACTGGTCCCGGGTAGGACTCGCCGCCTCGCCCTTCAGACTGGCGCACGCTAACGCACTCTACACACTCGCCAGGAGCTACCCGGCCGTGGTGGCGGTCCCGGAGTCTGTGTCCGACGAGTCTCTGCGCCGCGTGGCGAGGTGCTACAGACAGGGCCGGCTGCCCGTGCTCACCTGGAGGCATCCCAGGACCAGGGCGATACTGGCCAGGTCGGCGGCGTCTCACCACAAGGGCGTCATGGGCATGTTGAAGAGCTCCAGTCATCAGACACCTGCTCCGAGCGGCACAGAGACTAACTCCAGCTTAGAACAGGAGCGCTACCTATCGACACTAGTAGCTTTGACACCAGCTGCTCGCGGTGAGCTGGACGACTCCAGCCTAAGTCTGGACAGCTTGCTGCTGTGCTGTGAGGATAGTCACGTCAACCACACACCGCTGCTCACTAAGGCGGCTGGCACCTTGGGTGTCCTGGGTCGTGGTAGTGGGGGCAAGGGCGCCGGCGGACGGCACTT
CGGTCGGTGGGGCTCCCTCAAGGATCGGCGACACGCCTCTCACCTCGAACTGTACACCCCGCGACAAAGACTCTCCACAGCTGACCTTGACTCCGTATCTGGTGAAGGCGGTAATTCCCGCCGCGCGGCCCTGTACGTGTTCTCGGAGAAGGGCTCGGGTGCTCCGTGTGCGGGGGCGGAGCCCGTGCCAGTGGAGTACCCGGACGCACGAGCCACAAAGCACGCTTTTAAGAAGTTACTACGGGCCGCCGCCCCCTCAGCCCCCGGGACCGATGAACCAGGATTTCTAAGGCTGGTGGAGGAGTCGGGCTGGTTGTGGCAGCTCCGTCAGCTGTTCCAGCTGTCGGGGGCGGTGGTGGACCTGCTGGACGTGCAGGCCGCCTCCGTACTGCTGTCACTGGAAGACGGCTGGGATGTCACCGCACAGATATCATCTCTCGCTCAAATATGCCTTGACCCCTACTACCGGACCCTGGAAGGCTTTCGGATCCTAGTCGAGAAAGAGTGGCTCGCTCTTGGACACAAGTTCCAACAGCGCTCCAACCTGGCCGCTACACCTCAGCAGGGATTCACTCCCACCTTCCTCATGTTCCTGGATGCCGTTCATCAGCTCCAGAAACAGTTCCCGCTGGCGTTCGAGTTCAACGAGATGATAGAGGCAAAGCACCGCATCCACAGTGTGTCGTGTCGCTTCCGAACGTTCCTCTTAGACAGCGAGGCTCAGAGAGTGGAGCTGGGACTCGCGCCGGGACTCGAGAGGAGGCCTGACGCTTCCAGGCTGGGTGTGGATACAGGAGGAACAGGGGGTGCGGTGGGAGGTTCCGGGGAGTCAGAGGAGGCGCGCTCAGCTCTGGGGCTGTTCGAGTACATAGAGAGACTTCATCAACGAGCTCCGCTGTTCTATAACCTGCTGTACACACCCGACCTGGACAACCCGGTGTTGCGTCCAGTGAGTGCCGTCAGTAGCCTGGAGGTGTGGGAGTACTACGTGAGCGAGGAGCTGTCTCACGGAGCGCCCTACGAGCCCGACCTGTGGGGGGACGAGCGCGGAGCCCCGGCACAGAGAAGACTCCAGCACCGGCCCATGCTTCTACACATTTACATAGATATAGTGGAAGAGAATCCTCACACACAAACATTAATACGTGTCTTTGTTTTTCATTTGAGTTATTTCTGA

Protein sequence:

>DPOGS216107-PA
MNSALQSASALDEHSIAAAILPLATAYCRKLCTGVIQYAYMCIQATSAMFTQEHQVWTSQQFWEAAFYQDVQRDIKALYLPSPTHHNRVSSPREDDEYISLLKAQEPSALEIAAEQMRIWPTLSPEKQRELLASEESTLYSQAIHYANRMVYLLLPLEGAGRRDVRDDDRNSNSITNSVAESDSADAESGFEEADPGEAGNNVIKMVSRFVDKVCTEGGVTAEHVRCLHQMIPGVVHMHLETLDGVARESRRLPPVQKPRIAWPTLVHGEASAGGALRAVLLADGRSAALPQLLPAEGALFLTNYRLLFKGVPVDPYACEATVVRSFPLSALTREKGVRAAPAHLEHAPHDALQLRAATFQLIKVALDEEVSSEQAESFRKAVSRLRHPPHPLLHFALAPRAAPPSDLAQPKHHTLKGFAKKTLLKTARRAGLKPKPSKRQKYVLPADARSLTSPPLMASLSADTLSLEELELGGVSPEADACRSLERVRERMYARDWSRVGLAASPFRLAHANALYTLARSYPAVVAVPESVSDESLRRVARCYRQGRLPVLTWRHPRTRAILARSAASHHKGVMGMLKSSSHQTPAPSGTETNSSLEQERYLSTLVALTPAARGELDDSSLSLDSLLLCCEDSHVNHTPLLTKAAGTLGVLGRGSGGKGAGGRHFGRWGSLKDRRHASHLELYTPRQRLSTADLDSVSGEGGNSRRAALYVFSEKGSGAPCAGAEPVPVEYPDARATKHAFKKLLRAAAPSAPGTDEPGFLRLVEESGWLWQLRQLFQLSGAVVDLLDVQAASVLLSLEDGWDVTAQISSLAQICLDPYYRTLEGFRILVEKEWLALGHKFQQRSNLAATPQQGFTPTFLMFLDAVHQLQKQFPLAFEFNEMIEAKHRIHSVSCRFRTFLLDSEAQRVELGLAPGLERRPDASRLGVDTGGTGGAVGGSGESEEARSALGLFEYIERLHQRAPLFYNLLYTPDLDNPVLRPVSAVSSLEVWEYYVSEELSHGAPYEPDLWGDERGAPAQRRLQHRPMLLHIYIDIVEENPHTQTLIRVFVFHLSYF-
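
The transcript and protein lengths listed above can be cross-checked against each other: 3177 bp is 1059 codons, which is 1058 amino acids plus one stop codon. The following is a minimal Python sketch of that check, not part of the original record. It assumes the standard genetic code and that the trailing "-" in the protein sequence marks the stop codon. The helper names parse_fasta and translate are hypothetical, and the example input is only the first eight codons of DPOGS216107-TA with the final TGA stop appended, not the full sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# Assumption: the record follows the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(text):
    """Return {record_id: sequence}, joining sequences wrapped over lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence; stop codons become stop_symbol.

    Raises ValueError if the length is not a multiple of 3 and
    KeyError on codons with characters other than A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE[cds[i:i + 3]].replace("*", stop_symbol)
        for i in range(0, len(cds), 3)
    )


# Toy input: first eight codons of the transcript plus its stop codon,
# deliberately wrapped over two lines.
example = """>DPOGS216107-TA
ATGAATAGCGCTCTC
CAATCTGCGTGA
"""
records = parse_fasta(example)
print(translate(records["DPOGS216107-TA"]))  # MNSALQSA-

# Length check using the figures from the record header.
print(3177 == 3 * (1058 + 1))  # True
```

Because parse_fasta joins wrapped lines, the same check applies to the full nucleotide record whether the sequence is stored on one line or several.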