Monarch geneset OGS2.0

DPOGS209983
Transcript        DPOGS209983-TA  3390 bp
Protein           DPOGS209983-PA  1129 aa
Genomic position  DPSCF300148 + 306400-313534
RNAseq coverage   513x (Rank: top 24%)
Annotation
Heliconius       HMEL013547        0.0     63.80%
Bombyx           BGIBMGA011269-TA  0.0     60.97%
Drosophila       Pez-PA            6e-143  31.51%
EBI UniRef50     UniRef50_Q9V3V3   9e-141  31.51%  Pez n=7 Tax=melanogaster group RepID=Q9V3V3_DROME
NCBI RefSeq      XP_001969999.1    1e-142  31.72%  GG23633 [Drosophila erecta]
NCBI nr blastp   gi|194862427      2e-141  31.72%  GG23633 [Drosophila erecta]
NCBI nr blastx   gi|242021654      2e-170  35.86%  conserved hypothetical protein [Pediculus humanus corporis]
Group
Gene Ontology    GO:0005488  4.2e-20  binding
                 GO:0006470  5.8e-20  protein dephosphorylation
                 GO:0004725  5.8e-20  protein tyrosine phosphatase activity
                 GO:0005515  6.3e-17  protein binding
KEGG pathway 
InterPro domain  [17-246]    IPR019749  5.9e-33  Band 4.1 domain
                 [105-241]   IPR019748  1.8e-23  FERM central domain
                 [107-213]   IPR014352  4.2e-20  FERM/acyl-CoA-binding protein, 3-helical bundle
                 [905-1120]  IPR000242  5.8e-20  Protein-tyrosine phosphatase, receptor/non-receptor type
                 [241-330]   IPR011993  6.3e-17  Pleckstrin homology-type
                 [25-103]    IPR018979  3.8e-16  FERM, N-terminal
                 [252-333]   IPR018980  3.1e-14  FERM, C-terminal PH-like domain
                 [54-66]     IPR019750  4.5e-08  Band 4.1 family
Orthology group  MCL11336  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS209983-TA
ATGCCTTTCAAGCTCAAGTTGAAGAAGACACGGCAGTACAATGTAGCATCGAAGAGTCTGTTTGTCATAAGTGTGGAGTTGTTAGACGGCGGTGTAGCGGACTGCACTCTGTCGGTGGAGAGCACCGGCCAAGAGTGTCTCGACAACGTGTGCCAGCGCCAGGCGATCAACCAGCCGGAGTTCTTCGGATTGAGATATATAAACAGGAACCAACAGCCGCGATGGGTTCAACTCGACCGCCCACTAAAGCTCCAATTAGAAAAATATGCATCTTCACATCAATTGTATTTAAGGGTGATGTATTACGTGATATCTGGAACTAGTCTAATCACGGATGAAGTGACGCGGTACCATTATTTCCTGCAACTGAAGAATGATCTTGTGGAAGGTAGAATTATATGTGATTGCCAGCAAGCCGTGGTGCTGGCATCGTATAGTCGTCAAGCAGAGTACGGTAACCACGATCGAGAGCGACACACCGTGGAGTATCTAAAGAATTTGCTGACATTCCCCAAACAGATGTTAGACGGAGGTCAAGTACTAGACAGCAGCAGTCTAGCAGAGACGGGTCGCCTGGAGTGGCTGACGGCGGCTGTCATACAACATCACATGTCCCTGAACAACATGCCACAGAATGTATCAGATGCCATACATTATATCTTCTCCGAGCAGGCTCAGGCCGAGGAGGGTTTCATCACGACATGCCAGCAGCTCCGCGGCTACGGCCAGGAGATGTTCACCGCCAAGGACAAGAAGGACCAGGCGGAGGTGACCCTCGCCGTGTCGCTGACGGGCATCCGGGTACTCACCGACACCAGCGACACTCCACATTTCTACCGGTGGATGGACATAACGAACGTGATAAACCACAAGAGGACCTTCAGCGTGGAGTGTCAGCAGCGGGAGTCGGCCTCCTTCGTGCTGCCGTCGCCCGAGGACGGCAAGTACGTGTGGCGGATGTGCGTCATGCAGCACACCTTCTACATGCGGCACCAGCACTCCCTCTCCGCCGCCGCCGCCACCGCCACCGCCACTGCCGCTGCCGTCGCGGACCAGCACAGACCCAAGTTCCAGGAACACGGTCTGTCTGAGAGTCGCGAGGAGCTCGAGTCCCGCGACCTGCGCGAGCCTCACCCCGCCGCGCTCAGCACGGCTCGCGCCAGGTCCGCCTCGTGTCTGGAACTGGCTGAACATCGACCACATCTAGCCACCAGAGCCCTTCTTCCATCATACCGTCCAGCCCCGGACTACGAGACGGCGATCCAGCAGAAGTACCAGCAGCAGCGCGCTGAAGCCCAACTCCGCTATCAGAACCACTCTCATTCCCAACTCCTGGGGGCCAGCACTCAACCCCTCCTGTACGGCTCCCACCCCGATATACCTCGCGTTCACTACCCGGACGTGACGCGGCACACCGTGTCCGTGAAGCAGCCCGTCACCGCCGACGATTACTCCTACGGGCTCAAGTTCGTGGGGAACTACCCTGTGGCCGTCAACCCCAACGTGCAGACCGAGCACGCCCTCCACTACGTGAACGTGTACAAGCCGCCGCCCCCCTATCCGTCCAACGGTCTGGCCTCCAACTCGACCCCGGACCTGGCAGTAGCGAGTCAAGCTCTTAACTACCATCGGAGCTACATCGACGCGCACGTGTCCGGCTCCAGCCCCGACCTCGTCTCCACTAGGACCGCCCTCAACAGGCAGTACCTCGGTTATGTGAGTCCGCACAACGTCGTCAACTACGGCCGAGCGAACGTCCTGCCGGCGACTCACGGCACCTACAACAACCTCACGTCGGTGATGGAACCGAACCGCATCATCGTCGACCCCCACCTCGTATCGGACAATATACAGAAAGTCTACGACGACAGAGGGAACGTCATGTACTCCATGCCCACGAGAAGGATAGTGGTGGTGCCGACCGTGCGGCACGAGGAGCCCCAGGAACCGATATACGAGAACGTGCCTCTGCCGTGGACCTCGGACGGCGGCGCGAGAGG
GAGGGCGCACAGTCTCACCGCCGCGGGCGAGGTGACCGGCTTCAACGACAGGAACGGGAATTGCGCCCTCGCCACCGCGCAGAGCCTCGCCTCCAAGATAGACGACTCGCACTATGTCAACGCCCAGGTCATCAAGGGTGCGAGGGAGACGAGCGACACGTCAGAACAGAACCACGACACGGGTGACGTCACCGCCGACCTGGACAGGATGTCGCTGAGGAACGAAGACAAAGACTACGATGAGACACCCTACACCAGTCTGTCCACGCAGAGAGTGACCAACAACAATACCATCAGCAAGTCGGCCGACAACGTCACGTCCATAATGGTGACGGGAGAAGACCACGAGCATGACGGGGCGGCCGGGGCTCCGGGAAAGGACGCGCGGGACTCCTCCTACAGCAGCAGCGTGGAGATGGATTGCAACAACGCCAGCAAGGAGAAGAAGAGGAGGAGGTGGGGGGTGTTCATGGGCAGGAGTAAGAACGTCGAGGTGAAGAGCGCCACTCTCGGGAGGGATAGGGCGAAGGCGGGACAGACGACCAACAAGCATCGCTGGTCCACTGGACTGCCCAAATATCAGCCGCTACCACCCAGCATTACCAAAGAGACTCTGTGTCAACTGTTGGAGCGCAAGATGTCGGAGGAGCAGCTGTCGTTCTCGTTCGAATCCATCCCCCGGGGCCGCGGCGGGGACGGGGACTCCGGCCGGCCTTACGACGACAACCGCGTCAGACTGCACCCCACGGCCGCCAACCCGCAAGGGTACATCGACGCATCGCACATCACAATGACGGTGGGAGGTTCCCAGCGGTTCTACGTGGTGGTCCAGGCGAACAGCGGCGAGCAGGCGGCCCTGCTCTGGGAGTGCGTGTGGCAGGTCGGCGCCAGGGTGCTGGCACTCGCGGATGGCCAGCAGCCGCGCTACCTGCCGCAGGACCAGCGGACCATCACCTACGGACAGTTCGAGGTGTGGTGCGGCGCGTGGTCGTGCACGTCGTGGGGCGGGTCGCGCCGGGTCACGGTGCGGAGAGGCGCCGGCCCGCCCAGGACGCTCTGGCACGTCCGCTGTGGCTGGCCCGCAGACACCACCGGCTTCCTGGACTTCATATCGGAGATCAACAGTCTCCGTGAGACGTGCGAGGCGGAGGCGGCGGCCGAGCGCTCCCTGACCCTGAACGCGCCGCTCGTGCTGGGAGGGCGCGCCGCCGGCGTCACGCTGGCCGCGGACCTGCTGCTGCACGTCATCGACACCAACCAGGAGCTGGACATCCCTCGCACCGTATCGTTGCTGCACCAGCAGCGCGCCGGCCTGCTGCCGAGCCTGCAGCACTACCGCTTCCTGCACCTGGTGCTGCTCCACTACCTCAAGCAGTCCCGCCTCATATGA

Protein sequence:

>DPOGS209983-PA
MPFKLKLKKTRQYNVASKSLFVISVELLDGGVADCTLSVESTGQECLDNVCQRQAINQPEFFGLRYINRNQQPRWVQLDRPLKLQLEKYASSHQLYLRVMYYVISGTSLITDEVTRYHYFLQLKNDLVEGRIICDCQQAVVLASYSRQAEYGNHDRERHTVEYLKNLLTFPKQMLDGGQVLDSSSLAETGRLEWLTAAVIQHHMSLNNMPQNVSDAIHYIFSEQAQAEEGFITTCQQLRGYGQEMFTAKDKKDQAEVTLAVSLTGIRVLTDTSDTPHFYRWMDITNVINHKRTFSVECQQRESASFVLPSPEDGKYVWRMCVMQHTFYMRHQHSLSAAAATATATAAAVADQHRPKFQEHGLSESREELESRDLREPHPAALSTARARSASCLELAEHRPHLATRALLPSYRPAPDYETAIQQKYQQQRAEAQLRYQNHSHSQLLGASTQPLLYGSHPDIPRVHYPDVTRHTVSVKQPVTADDYSYGLKFVGNYPVAVNPNVQTEHALHYVNVYKPPPPYPSNGLASNSTPDLAVASQALNYHRSYIDAHVSGSSPDLVSTRTALNRQYLGYVSPHNVVNYGRANVLPATHGTYNNLTSVMEPNRIIVDPHLVSDNIQKVYDDRGNVMYSMPTRRIVVVPTVRHEEPQEPIYENVPLPWTSDGGARGRAHSLTAAGEVTGFNDRNGNCALATAQSLASKIDDSHYVNAQVIKGARETSDTSEQNHDTGDVTADLDRMSLRNEDKDYDETPYTSLSTQRVTNNNTISKSADNVTSIMVTGEDHEHDGAAGAPGKDARDSSYSSSVEMDCNNASKEKKRRRWGVFMGRSKNVEVKSATLGRDRAKAGQTTNKHRWSTGLPKYQPLPPSITKETLCQLLERKMSEEQLSFSFESIPRGRGGDGDSGRPYDDNRVRLHPTAANPQGYIDASHITMTVGGSQRFYVVVQANSGEQAALLWECVWQVGARVLALADGQQPRYLPQDQRTITYGQFEVWCGAWSCTSWGGSRRVTVRRGAGPPRTLWHVRCGWPADTTGFLDFISEINSLRETCEAEAAAERSLTLNAPLVLGGRAAGVTLAADLLLHVIDTNQELDIPRTVSLLHQQRAGLLPSLQHYRFLHLVLLHYLKQSRLI-
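
The lengths and domain coordinates listed in this record can be cross-checked with a little arithmetic. The sketch below assumes the transcript is coding sequence only (no UTRs) and ends in a single stop codon, which the record does not state explicitly. The function names are illustrative and not part of any database tooling.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3390
PROTEIN_AA = 1129

# InterPro domain coordinates (start, end) as listed in the record.
DOMAINS = [
    (17, 246), (105, 241), (107, 213), (905, 1120),
    (241, 330), (25, 103), (252, 333), (54, 66),
]


def cds_matches_protein(transcript_bp: int, protein_aa: int, stop_codons: int = 1) -> bool:
    """True if the transcript length equals 3 nt per residue plus the stop codon(s)."""
    return transcript_bp == 3 * (protein_aa + stop_codons)


def domains_within_protein(domains, protein_aa: int) -> bool:
    """True if every domain is a valid 1-based interval inside the protein."""
    return all(1 <= start <= end <= protein_aa for start, end in domains)


if __name__ == "__main__":
    print(cds_matches_protein(TRANSCRIPT_BP, PROTEIN_AA))
    print(domains_within_protein(DOMAINS, PROTEIN_AA))
```

Under these assumptions, 3 x (1129 + 1) = 3390, so the transcript and protein lengths agree, and all listed domains fall within residues 1 to 1129.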