Monarch geneset OGS2.0

DPOGS215971
Transcript        DPOGS215971-TA  3408 bp
Protein           DPOGS215971-PA  1135 aa
Genomic position  DPSCF300078 - 630616-651040
RNAseq coverage   283x (Rank: top 39%)
Annotation
Heliconius       HMEL018042        3e-133  56.37%
Bombyx           BGIBMGA000928-TA  1e-168  69.68%
Drosophila       CG5921-PD         5e-57   39.23%
EBI UniRef50     UniRef50_D2A152   5e-72   39.79%  Putative uncharacterized protein GLEAN_07147 n=1 Tax=Tribolium castaneum RepID=D2A152_TRICA
NCBI RefSeq      XP_974840.1       1e-72   39.79%  PREDICTED: similar to harmonin [Tribolium castaneum]
NCBI nr blastp   gi|91081527       2e-71   39.79%  PREDICTED: similar to harmonin [Tribolium castaneum]
NCBI nr blastx   gi|91081527       2e-107  38.72%  PREDICTED: similar to harmonin [Tribolium castaneum]
Group
Gene Ontology    GO:0005515            1e-17  protein binding
KEGG pathway     oaa:100075875         2e-07
                 K05629 (AIP1)         maps-> Tight junction
InterPro domain  [25-113] IPR001478    1e-17  PDZ/DHR/GLGF
Orthology group  MCL18978              Insect specific

Nucleotide sequence:

>DPOGS215971-TA
ATGACAGCCGGGGCGACGGAGAGGCGCGTACGAGTCGTCCGTCTACTGCGACCGCGCAAGGCCGCTCCGGCGTTTGGATTCTCATTGCGCGGCGGGCGGGAGTATGCGACTGGCTTCTTTATATCGAAAGTGGACCTCAATTCGGAGGCCCATCTCCAAGGGTTAAAGGTCGGCGACCAGGTGGTTAGCGTCAACGGCTATCGGGTGGATGACGCGGTGCATGCTGAGCTGGTTCATTACATCACTTCACAGTCTCGGCTCAAAATTAAAGTTAGACATGTGGGGATGATTCCTGTTAAAGATAAGGAGCACGAGGCTCTATCGTGGCAGTTCGTCTCTGAACGAGCGTCTTTGGCTTCAGATGTGTCCTCATCGCCGACTCTCTGCCATAACACAGAACATTTGACGGACTTGCGACTGTCAATACTTGTACCACCAAGAGCAAAGCTTGGCTGCGGTATATGTAAAGGACCGGAATGGAAACCAGGAATTTTTGTTCAATTTGTTAGAGAAGGCGGTATCGCGCGAGAAGCCGGTTTACGGCCCGGTGACCAGATTATGTCCTGTAATGGCCATGATTTCTCGAACATATCTTTCAACGAGGCTACAATGAATCGAATTAAAAGACGAACATGGGACTGTTTAGATTTCGACTGGAAAAATAACGACATTCATAATTTTGACGCACCCTCCGATATCGAGCCAGATTATGACAGTCCTAGTATACCAAATAACCCTCATACTTATGAAAGCAAATCAAACATTAGAAGTAAATCTAAAACAGACATAGAAACAACAAAAATACCCAGCCAAATGAATAGGACCATCATAAATTTAACAGAAGATGGTGCAACTATTCAATGCAGTTACGACAACAACAATTCTAGAAAAGGTCCATTTACTACAAGTCAAACGAATAGAGAGAACGAAAAGACGATTGTTGTGGAAGTTCATCACATGAACACCGTCCCATGCTCTAAGACGGTTGGTAATTTCGGTCAAACAAAAAAGGACGTTGAGACGGACGCCACCAGCATAACGAGTTCGGCTACACTATCCAGTGCCATCGCTGATGAAATACAGAGGAGGAAATTAAATAAGAAACCTGATATAGAAAAGGAAATCAAACAAGTCCCGGTTAAGAAAACGAGCATCCCAAAGGCAATAGACAATGACCAAAAAAAACATCACGACGCCCTGATGGATGAATTCAAAAAGGTCCATAGGAAAATGTTTGCAAATCTTGAAGGAAAAAGTGAAGTTAATAATAGTGAAAAACAAACAGATAATATGGTCGGCGACCAGGTGGTTAGCGTCAACGGCTATCGGGTGGATGACGCGGTGCATGCTGAGCTGGTTCATTACATCACTTCACAGTCTCGGCTCAAAATTAAAGTTAGACATGTGGGGATGATTCCTGTTAAAGATAAGGAGCACGAGGCTCTATCGTGGCAGTTCGTCTCTGAACGAGCGTCTTTGGCTTCAGATGTGTCCTCATCGCCGACTCTCTGCCATAACACAGAACATTTGACGGACTTGCGACTGTCAATACTTGTACCACCAAGAGCAAAGCTTGGCTGCGGTATATGTAAAGGACCGGAATGGAAACCAGGAATTTTTGTTCAATTTGTTAGAGAAGGCGGTATCGCGCGAGAAGCCGGTTTACGGCCCGGTGACCAGATTATGTCCTGTAATGGCCATGATTTCTCGAACATATCTTTCAACGAGGCCATATCAGCGATGAAAGCAAGCGGTCGACTCGAGCTGGTAGTGCGTGAAGGCGCTGGTACAGAGCTGGTTTCGCCTGAAAGCTCCGGATATAATAGCTCTGCGTCATCCGCTGCTGGTGAGAGGAGCCCTGCGCCACCAGTGCCACTAGCGCCGCCAGCAGCTCTGAGAAGACGACTCGCCAGCGTCGCCGAGGAAGCTGCTGATAGAGCCGATAGGCTAACAATGAATCGAATTAAAAGACGAACATGGGACTGTTTAGATTTCGACTG
GAAAAATAACGACATTCATAATTTTGACGCACCCTCCGATATCGAGCCAGATTATGACAGTCCTAGTATACCAAATAACCCTCATACTTATGAAAGCAAATCAAACATTAGAAGTAAATCTAAAACAGACATAGAAACAACAAAAATACCCAGCCAAATGAATAGGACCATCATAAATTTAACAGAAGATGGCGCAACTATTCAATGCAGTTACGACAACAACAATTCTAGAAAAGGTCCATTCACTACAAGTCAAACGAATAGAGAGAACGAAAAGACGATTGTTGTGGAAGTTCATCACATGAACACCGTCCCATGCTCTAAGACGGTTGGTAATTTCGGTCAAACAAAAAAGGACGTTGAGACGGACGCCACCAGCATAACGAGTTCGGCTACACTATCCAGTGCCATCGCTGATGAAATACAGAGGAGGAAATTAAATAAGAAACCTGATATAGAAAAGGAAATCAAACAAGTCCCGGTTAAGAAAACGAGCATCCCAAAGGCAATAGACAATGACCAAAAAAAACATCACGACGCCCTGATGGATGAATTCAAAAAGGTCCATAGGAAAATGTTTGCAAATCTTGAAGGAAAAAGTGAAGTTAATAATAGTGAAAAACAAACAGATAATATGGATCGCCAGTCAACGATTCCTCTTAGAAAAATCGAATCAGTGCCAGCAGAATTAATCAACAAGGATAAAGAAACGCCAGAAAAACGGGAGAAGAAGGCTCCGCCACCGCCTCCGCCGATGCCAATAGAAAATGGACACCATAATCATAATGGAGAACTCATCACAAACAATCACATCAACGACAATATGAAAAGTCCAGAAATAATGAAAGTGCCGAATTTAAGAAAAGTTGGTTCTTTAACTCGAATACCAACACCTGATTATAATAAATCGGACAGTCCCGTACAGCTCCGAGTGAAGACCGTTAAGGAAGAGAATGCTGAAGTAGAATCTTTAGAATCATATAAACTAAAAAATCCACTTAACATTCAACCAAAGCCGCCGTCTCATTACTTCATCAAAGCACCTAATGGAACTGCGACCATGAAGAAACATGTGCGACCAGTATCGGTTACAATAGGGGAATATGTCAACAACGTAGGTCGCAAAGAGCCGGCCAAGTTAGACTTCCTAAACGGAGACAAGATAGACGGGCATAGAGCAGCTCATGATGAACCTATCACAAGTAGACTTCAATCCGAATTGGCTCTCACATTGTCTAGATCAAATTTACGTAAAAAGACTGAAGCTTTGGTTCACGCTTCCTATAAAAAAAAACTTATATTGAACATAAATTCGAACGATGAAAACTTTCACGCTAGATACGGCAACAACCTTTTACCATTACCGCCATCGATATCGATACCAAAAATTACCAAAATAAATATCTAA

Protein sequence:

>DPOGS215971-PA
MTAGATERRVRVVRLLRPRKAAPAFGFSLRGGREYATGFFISKVDLNSEAHLQGLKVGDQVVSVNGYRVDDAVHAELVHYITSQSRLKIKVRHVGMIPVKDKEHEALSWQFVSERASLASDVSSSPTLCHNTEHLTDLRLSILVPPRAKLGCGICKGPEWKPGIFVQFVREGGIAREAGLRPGDQIMSCNGHDFSNISFNEATMNRIKRRTWDCLDFDWKNNDIHNFDAPSDIEPDYDSPSIPNNPHTYESKSNIRSKSKTDIETTKIPSQMNRTIINLTEDGATIQCSYDNNNSRKGPFTTSQTNRENEKTIVVEVHHMNTVPCSKTVGNFGQTKKDVETDATSITSSATLSSAIADEIQRRKLNKKPDIEKEIKQVPVKKTSIPKAIDNDQKKHHDALMDEFKKVHRKMFANLEGKSEVNNSEKQTDNMVGDQVVSVNGYRVDDAVHAELVHYITSQSRLKIKVRHVGMIPVKDKEHEALSWQFVSERASLASDVSSSPTLCHNTEHLTDLRLSILVPPRAKLGCGICKGPEWKPGIFVQFVREGGIAREAGLRPGDQIMSCNGHDFSNISFNEAISAMKASGRLELVVREGAGTELVSPESSGYNSSASSAAGERSPAPPVPLAPPAALRRRLASVAEEAADRADRLTMNRIKRRTWDCLDFDWKNNDIHNFDAPSDIEPDYDSPSIPNNPHTYESKSNIRSKSKTDIETTKIPSQMNRTIINLTEDGATIQCSYDNNNSRKGPFTTSQTNRENEKTIVVEVHHMNTVPCSKTVGNFGQTKKDVETDATSITSSATLSSAIADEIQRRKLNKKPDIEKEIKQVPVKKTSIPKAIDNDQKKHHDALMDEFKKVHRKMFANLEGKSEVNNSEKQTDNMDRQSTIPLRKIESVPAELINKDKETPEKREKKAPPPPPPMPIENGHHNHNGELITNNHINDNMKSPEIMKVPNLRKVGSLTRIPTPDYNKSDSPVQLRVKTVKEENAEVESLESYKLKNPLNIQPKPPSHYFIKAPNGTATMKKHVRPVSVTIGEYVNNVGRKEPAKLDFLNGDKIDGHRAAHDEPITSRLQSELALTLSRSNLRKKTEALVHASYKKKLILNINSNDENFHARYGNNLLPLPPSISIPKITKINI-
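
The lengths listed in this record can be cross-checked against each other. The sketch below assumes the transcript is coding sequence only, with no UTRs, which fits the nucleotide sequence starting with ATG and ending with the stop codon TAA. The function names are illustrative and are not part of any MonarchBase tooling.

```python
# Values copied from the record above.
TRANSCRIPT_LENGTH_BP = 3408
PROTEIN_LENGTH_AA = 1135


def expected_cds_length(protein_length_aa: int) -> int:
    """Return the CDS length in bp for a protein plus one stop codon."""
    return 3 * (protein_length_aa + 1)


def is_cds_only(transcript_length_bp: int, protein_length_aa: int) -> bool:
    """True if the transcript length is fully explained by the coding sequence.

    Assumption: the transcript has no untranslated regions.
    """
    return transcript_length_bp == expected_cds_length(protein_length_aa)


if __name__ == "__main__":
    print(expected_cds_length(PROTEIN_LENGTH_AA))  # 3408
    print(is_cds_only(TRANSCRIPT_LENGTH_BP, PROTEIN_LENGTH_AA))  # True
```

Under that assumption, 1135 residues plus one stop codon account for all 3408 bp of the transcript.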