Monarch geneset OGS2.0

DPOGS216187
Transcript: DPOGS216187-TA  3369 bp
Protein: DPOGS216187-PA  1122 aa
Genomic position: DPSCF300080 - 445997-457877
RNAseq coverage: 154x (Rank: top 53%)
Annotation
Heliconius: HMEL005842  0.0  79.46%
Bombyx: BGIBMGA004537-TA  0.0  69.16%
Drosophila: Ptp99A-PF  3e-61  53.21%
EBI UniRef50: UniRef50_E2A427  5e-63  53.07%  Tyrosine-protein phosphatase 99A n=7 Tax=Formicidae RepID=E2A427_CAMFO
NCBI RefSeq: XP_001121162.1  3e-65  54.39%  PREDICTED: similar to Protein tyrosine phosphatase 99A CG2005-PB, isoform B [Apis mellifera]
NCBI nr blastp: gi|321472438  2e-64  55.56%  hypothetical protein DAPPUDRAFT_48090 [Daphnia pulex]
NCBI nr blastx: gi|321472438  5e-62  55.56%  hypothetical protein DAPPUDRAFT_48090 [Daphnia pulex]
Group
Gene Ontology: GO:0006470  8.2e-63  protein dephosphorylation
Gene Ontology: GO:0004725  8.2e-63  protein tyrosine phosphatase activity
KEGG pathway 
InterPro domain: [215-458]  IPR000242  8.2e-63  Protein-tyrosine phosphatase, receptor/non-receptor type
InterPro domain: [359-455]  IPR003595  1.5e-32  Protein-tyrosine phosphatase, catalytic
InterPro domain: [49-156]  IPR008957  1e-08  Fibronectin type III domain
InterPro domain: [42-140]  IPR013783  9.8e-06  Immunoglobulin-like fold
Orthology group: MCL25031  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS216187-TA
ATGCTCTCTACTCAGCATCAACATTCTAACGAACCATTCACACCGTATAAGATATGGGTGCGTGCGTTTTATAACATACCCTTAGCCGGTGTGATGTCGTCCGACCTCCTGGCCCGCCTGGGACCCCAGTCCGAGTCCCTCTACGTGTTGACGGACGTTAGACCTCCATCAGCACCTGTTATACTCAACCTGACCTGCGACCAACAGAATGGTATTCTCTACCTGCAATGGCGTCAACCTCTCGAGTACAACAACTCATTAGACCAGTACGTGGTGACGCTGAGGAAGATACCGGAACAGCAGCCGAGGACCAGACTCACGCTGTTCACCAAGAAGGAGGACATCGAGACTATGATCAGTGTTAAAGTGGATTTATCAAACTCTACAATGTACGAGGTGAAGATTTACGCCGTGACCTTATCAGTTGCGACGCCGAAAACACTTATCAATGGATCGGAATCACCTCCGAAGGATGTGTCGAGTGAGTCGTGCGCGGTGGTAGCGGCACGAGCGGGGATCGCGGATATGGAGGGCGAAGCGCCGGGTGCTCCGGCAGCACTGTTGGCTGCCGCCCTACTAGCCGCCCTAGCAGCCGGTGGAGCCGCCCTAGTTTACTGGAGATGTAGATCTCGTGTGAGCAAATGTATCAGCGCCGCTTACAATTACTTGGAAGAGGGCGGTGAGAGAGCGGCCAGGGCTCCACTAAATATAAACAAGAAACCTCATGGCGTCCCCCGTCAGATCAAGCTGGAGTGGGTAGTGTGGCGGCGGAGGTACATCGCCACTCAAGGCCCCACGCCAGCCACACTAGACGCCTTCTGGCGGATGATCTGGCAGCACAGGGTCTGCACCTTGGTCATGATCACCAACCTCGTGGAGCGGGGCAGGCGTAAGTGCGACATGTACTGGCCGGCGGGCGGGCGCGGCAGTTCCGCGGAGTTCGGCGGGATACACGTGACGCTGCTGTATGAGGACGTGAGGGCCGCTTACACCGTCAGACATCTCAGGGTCAAGAGTACAGTCGCGGGTAGCGAATCGTCGAGCGAGTCGAGTACAGCGAGTGGCGAGGGTCGCCACGTGGTCCAGTACCACTACACCGTGTGGCCCGACCACGGCACGCCACGGCATCCGTTAGCTGTGTTGCCGTTCGTACGGGCCGCTGCAGATCCGGCAACCGTGCTCGTTCATTGCAGTGCGGGCGTTGGAAGAACGGGTACATACATAGTGATAGACGCACAACTGAATCAATTAAAACTCACGGGAACCCTGTCGCCTTTAGGGTTCCTCTGTCGCGCGCGAACGCAAAGGAACCATTTAGTGCAGACCGAGGAACAGTATGTATTCGTACATGACGCTCTGTTAGAGTACGTGCGTTCGGGTAACACAGAAGTGGAGTTCACAAAAGCTAGGGAATATCTGGCGAAGCTTCTAGAACCGATATCAGAGGAGGAGCTAGCGGTTATGGACCTTAATCCTATAAAGCATAAGAGCGTTAACGAAATGAACGGCGAGAACGACATGTCGAGTGTCAAATCTATAGAATGTAGCGATAATATAGTAGAAAATGGAAGCAGTCAAGTGTCGATAAAAACTGACGAATTGAACAGCGAAAGTAAATCATCTGTGGACAACCAAGAGAAGGATGGATTGGTCAACGGCGATGACTCGGAGGGAGTTTATGATCTGGCGCCGAGGTCCACAGATACTTATAATAAGAAAATGGCGGCCTATAACAGCATGAATGAACAAGAGAAAGAGGAAATGCGCAGAGTAAACCGAGCCGAAAACTACGCGCTGTTGGAACGGATGCGCTCTTTATCGAACAGGCACCAACTGTACCAAGGACCTCCTCCTGTTAACTTGTTAGAGAAACAGAATCAGTTGATAACACGTTCGTGTGTGGAGGCGAGTGTGTGTGCTCGAGCTCCTCACAACGCTGATAAGAACAGACCGGGTGGCATCCTACCCTCAGACTCCGCCAGGGTCATGTTGGTACCGAA
ACCCGGTGTTGAAGGAAGTGAATATATAAATGCTTCGTGGGTTTGTGGCGTGCGTCGTGTGAGAGAGTACGCCGTGTGTCAACACACAGAGGCACCTGACCCGTGGCGGCTGTTGTGGGATCACACCGCACAACTAGTGTTGTTACTACACGATGATGAACATCCGGAGTGCGATGTGTTTTGGCCGACAGAAGATGAGAAGGAACTGTTCGTGGCTAATTTCCGTGCGAGTTTTGTGTCTAAGGAAGTATATGTGGCGCACAGGAGATCGGATAGGACGAGTCGAACAGACACACCCAGCGAGCCCGAGACCAACGGGTACAGAAGGCAGGAGGGCAGCGACTGTGCTGACGACGAACGACTCATACCTGATAATAATTCACCAGTTAATAATACGGAACCATCCTACAGGTTCGACAGGACGGAACTCCGATTGGAGCGTCTCAGTAACAGAGATCTGTCCGCCCGGAAATCCATAGCGAACGGAGATTTATTCTCGTCATTATCAGAGAAGAAGAACGGTCCAAAATCACCGAGAAGTCCATCGAAGATGTCGTTGAAGAACTTTAAGCTGAGCTCTCCCACCAAGTTCAAATTCCCCGAGTGGGGTTCTAGAGCGGCTGGTTCACCGCCAGATACTGCACCACCACCGCCACCTATCACACCATCGCTCACCGTAGAAGAGGAAGCTGAACTACGAAGACCCGTGTACACATTCGAAAAAGTAAAACACCTCCCAAACGTACCATCAGACAGGGTTATAGAAGTGACGAACGTGAGCGTGCATTCATTGCAGGACGATTACCAGTTAAGTGTTAAGTTCATAAAGTGCAGTGGCTGGTTGAAAGGTGCTACCACCAAATACAGCGCGGGTCGGCCGGATGATAATGAATACGTTCGTGCTGTGAGGCATTCCAGCGGGAGCGAACGGGAAGCGGCCATCGATAGGCTCATAGCGCCCTACCAGGATTCGTTCGCGTTGATAGAGTTCGTCGCTGGATGTCAGATGGAATACAAGAATGGACCGGTTGTTGTTGTTGACAAATACGGCGGCTGGCGAGCGTTAACTTTTTGTTCGCTAAGCGCTGCGTGTGGTGGAGTAAGGAATCCAGATATCAAGGAACCTGGCAGTGAGTGGGCTTCACCCTGTGTGGCCGCTGATTTATACTGCAGTAGTGCTCTGAACGCTCACGCCCGCTGCCAGGCCTCACCAAATTCCCCGGCCTCACAGACTTCTCAGACTTCTCAGGCCTCACAGAGCTCTCAGACATCTCAGGAACGTCCCGTGACCCACTCGCCGGAGGCTCTGCTGGCGGCGTACTGCGCCCTCACCGCGTACGCAACTAAACTACCGAGGCCCGATAGCTGA

Protein sequence:

>DPOGS216187-PA
MLSTQHQHSNEPFTPYKIWVRAFYNIPLAGVMSSDLLARLGPQSESLYVLTDVRPPSAPVILNLTCDQQNGILYLQWRQPLEYNNSLDQYVVTLRKIPEQQPRTRLTLFTKKEDIETMISVKVDLSNSTMYEVKIYAVTLSVATPKTLINGSESPPKDVSSESCAVVAARAGIADMEGEAPGAPAALLAAALLAALAAGGAALVYWRCRSRVSKCISAAYNYLEEGGERAARAPLNINKKPHGVPRQIKLEWVVWRRRYIATQGPTPATLDAFWRMIWQHRVCTLVMITNLVERGRRKCDMYWPAGGRGSSAEFGGIHVTLLYEDVRAAYTVRHLRVKSTVAGSESSSESSTASGEGRHVVQYHYTVWPDHGTPRHPLAVLPFVRAAADPATVLVHCSAGVGRTGTYIVIDAQLNQLKLTGTLSPLGFLCRARTQRNHLVQTEEQYVFVHDALLEYVRSGNTEVEFTKAREYLAKLLEPISEEELAVMDLNPIKHKSVNEMNGENDMSSVKSIECSDNIVENGSSQVSIKTDELNSESKSSVDNQEKDGLVNGDDSEGVYDLAPRSTDTYNKKMAAYNSMNEQEKEEMRRVNRAENYALLERMRSLSNRHQLYQGPPPVNLLEKQNQLITRSCVEASVCARAPHNADKNRPGGILPSDSARVMLVPKPGVEGSEYINASWVCGVRRVREYAVCQHTEAPDPWRLLWDHTAQLVLLLHDDEHPECDVFWPTEDEKELFVANFRASFVSKEVYVAHRRSDRTSRTDTPSEPETNGYRRQEGSDCADDERLIPDNNSPVNNTEPSYRFDRTELRLERLSNRDLSARKSIANGDLFSSLSEKKNGPKSPRSPSKMSLKNFKLSSPTKFKFPEWGSRAAGSPPDTAPPPPPITPSLTVEEEAELRRPVYTFEKVKHLPNVPSDRVIEVTNVSVHSLQDDYQLSVKFIKCSGWLKGATTKYSAGRPDDNEYVRAVRHSSGSEREAAIDRLIAPYQDSFALIEFVAGCQMEYKNGPVVVVDKYGGWRALTFCSLSAACGGVRNPDIKEPGSEWASPCVAADLYCSSALNAHARCQASPNSPASQTSQTSQASQSSQTSQERPVTHSPEALLAAYCALTAYATKLPRPDS-
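
Consistency check (illustrative sketch, not part of the original record):

The transcript and protein lengths listed in this record fit a coding sequence of 1122 codons plus one stop codon, since (1122 + 1) x 3 = 3369. The Python sketch below shows one way to spot-check this. It assumes the standard genetic code, and translates only the first 30 and last 9 nucleotides of DPOGS216187-TA as copied from the sequence above. The names CODON_TABLE and translate are hypothetical helpers introduced for illustration. The record prints the stop codon as "-", while this sketch uses "*".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Fragments copied from the DPOGS216187-TA sequence in this record.
first_30_nt = "ATGCTCTCTACTCAGCATCAACATTCTAAC"
last_9_nt = "GATAGCTGA"

# Lengths as listed in the record.
transcript_bp = 3369
protein_aa = 1122

print(translate(first_30_nt))              # start of the protein
print(translate(last_9_nt))                # end of the protein, with stop
print((protein_aa + 1) * 3 == transcript_bp)
```

The two translated fragments can be compared by eye with the start (MLSTQHQHSN) and end (DS followed by the stop symbol) of DPOGS216187-PA.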