Monarch geneset OGS2.0

DPOGS210451
Transcript: DPOGS210451-TA (3423 bp)
Protein: DPOGS210451-PA (1140 aa)
Genomic position: DPSCF300062 + 128431-132148
RNAseq coverage: 0x (Rank: top 98%)
Annotation
Heliconius: HMEL015131, 4e-150, 34.92%
Bombyx: BGIBMGA001857-TA, 6e-135, 32.26%
Drosophila: CG5756-PA, 3e-31, 36.92%
EBI UniRef50: UniRef50_B4MYL7, 4e-30, 45.08%, GK22055 n=1 Tax=Drosophila willistoni RepID=B4MYL7_DROWI
NCBI RefSeq: XP_002066220.1, 7e-31, 45.08%, GK22055 [Drosophila willistoni]
NCBI nr blastp: gi|195436529, 1e-29, 45.08%, GK22055 [Drosophila willistoni]
NCBI nr blastx: gi|307185283, 4e-28, 27.31%, hypothetical protein EAG_06682 [Camponotus floridanus]
Group
Gene Ontology: GO:0008061, 6.2e-13, chitin binding
Gene Ontology: GO:0006030, 6.2e-13, chitin metabolic process
Gene Ontology: GO:0005576, 6.2e-13, extracellular region
KEGG pathway:
InterPro domain: [42-109] IPR002557, 6.2e-13, Chitin binding domain
Orthology group: MCL21152, Lepidoptera specific

Nucleotide sequence:

>DPOGS210451-TA
ACTTTTGCAATACTCCGGAGCAGAGCACAAAATGAAAAAAGTCGTCGACCTTTTGACAATGGCCAAGACTTTGAAGTAAAGCTTGCCGTACCAGGAAAACCTGGAAATGACTACCCTGTTTTTCATCGAATTCCAAGAACCTCCTTTACTTGTTCAGGAAAAGAACCCGGTTATTATGCCGACATGGAAACAAATTGCCAAGTATTTCGAGTCTGTACATTCGGTTCAACTTTTGGCTACCAATCATTTCTATGTCCCAATGGAACCTTATTCAATCAAGCTGTCTACGTTTGTGATTGGTGGATGAATGTTAATTGTACAAAATCAAAAGAATTCTATAATTATAAAAGTGAATTATTAAATTTAAGGAACGGACCACATTTTATGAGGGATATTAAAAAAATGATAACACATCCTATGAGAAATCCTTACGATCAGAGCTTTGCCAAAGATCGTTTAATTATAATTCAAAATTACCAACCTCCGAATAGCCCTTTTAATAGTCTTTCTAAGAATTATGCACCAAAAAACTCCAGCGAAATAATATCTCAAGTAAAAAAAAATAGTACTACTGATGTCTTAACTGCTACTTCGAAACCTACTACCTACTACTATTCAAATTTTAATATTTTTCCAACAAGTAACAAAAGTTTTAGCCCTGTACGACAGCAATCTCAAAAATCATTTCCTCAACAAAAAGTAGGAACAAATTACAAACAACGTCAATTTTCTAACAGTCATTCGGGTCAAGCCTTACCAAATAAAACAAAAAATGTATTAAGTTACACAAAACGTTTGCAAAATATAGTAAAACCAACTCAGTCAAATCAAAGAGTTCAGCAGTTTTTGAATCATCATCATATCCCAGCTCCTTCAAGTCAAGAAAATAGATTAGTTATTAATGGCCAACAAAGCTATCCGACACAAAATACTTTAAGTAACAACCATTCCAAGGTACAACTAAATCGGCATATATTACAAGACAAAGAATTGTATTCCTACAAACACAATCAAGCACGCTTTTCAAAAGATACTAGGAGTCAGGTCAACATAGCATCACAACAAAACTTGGTGAGTAACGATAGCTCACCTACCCTTATAAGAAAAACTTTAGCGTTTCGAGAAATAATAAAAGATCCTAAAAACGGAACTCCAAGATCTAAAATAACATTTAAAACTTGGATTTTAAGACCCTCCAACAGTGAAAAATTATCCGCGGACCCAACGCCATACACTTACAATACACCAAAAGTTACCATAATTGACGATAACGATTCAGACGTTATTAGTAATACTTCTGAGGAATCTGAAGATCACACTGACCTTGATTTAGAACCCTATCAATATAATCCACCAACAAGCTCAAGCGCTCTAACTTATGAAACAACCACTGAAATTCCCTTAACAACTAAAACTAGTTTTAGCCCTTCAGAACCAACAAAACCTTCATTTTTATATCTACCTCCAACGACTTATAAGACACCGATACTTTTGTATAATTATCCAACAAACAATATTCAAACACNACGTCAGTATTTACTTCCAGAAATAATACAGAGCCCTCTCATAGAACTAAGTAGTTCCCCTGTACCTTTTATACAATCACATACAAACGTGAAAGAGCCTATTTTACCTCTAAGGAAAAAAAATTCAACACAAATAAATCACAACCATAATTTATTTACAAATACACTGCTAAAAAATAACTTAGAAATAGTAAAAGATTTACTGAAAGATACAAATAAATTATTTAGAATAATATCACCTAACAACATTTATGGGCTTAAACAAGAAATTAAAACAATTGATTATCTTGATGAAAACTTACCACAAAATATACAAGACCAGGTCCATGACAAAACGTTACAACAATCTCCTACAGAATTACCCAAAAATTCTAAACTCATTGCAACGCCATCAATTGTGTTAGAACCACCTGATGAGAGCTATGAATATAGTTTTAAAAATTCTAATAAATTATCTTATTTATCCTTCATAAA
ACAACCAATTATTCCTACAATAGAAAGAACAGTTTCCATTAAAATAACTATGCCCCAAAAAATTGCAGACTTTATATTTAAAAAAAATGTTTCTGATAATTTAGAAATCTTAAGCACAGAAAACACGAATTCCTTTGTGCTGGCTAATAAAATGCCAAACAAAGAAGACTCTCATCAATACGTTCCCATCGGCAAACTTGTTTGGAATAATAGTTCTGATACTTCTCCTTCTCAGGAATTGCTTTTTTCTTTTTTGGCAGACTCTATTAGTGCAGCTCAAGAACATAAAAATATTGCCAAACAGGAAAATTTTCAGCCAACACCAACACATTTCACATACCTAAATAAAAATGGAATAGGATCTATATCTGATAAAATATCACAAATGACATCAGAACAATTTTCAAACATAAAGCTCTCAAGTAACGATCAACTTTCGAGAAGAGCAAACCTAAATAATTTAAACAATGACTTACAAACCAGAGGCTACAAACAGATAGAAATCGATAATTCAGTAAATCGTCACCAGAACAAAACACATCAGGATTTGATTAATGCTAAGCAAATCGCCGGCAGTCATTTGTCTGATACAGATTCAAATTTGCAATCAAATGTGGAACCTATATACAGCGGTCAATTGTACCAACTTTCTGTTCCAGAAGTTACAAAACAGTTTTATAATTTTCTATCCCAAAAAAAGAGCAAATACAGTGTAAACTATGAAGGGAAGAATAACGAAAATAAATACAAACCATCACAATCTGAATTTGAAATGATAAAATCACAAATATTATCGCCCGAATCTAACCCTAGTAAACTAAATCAGAAATCAATAGACCAGAGAACTACAGCTTACGACTTTAAAGACAACATCATTATACCTAGCGACAGTATAGCTGCACAGATACATGACAACACAATAGGAATTATTCCTCATCCATTACAAAAAGATAAATTAATAAATTATAAGAAAGACAACATCTACTATATTTATACAAATCTAAACGATACTGATATAAACGACTTCAAGAGAAACAACATACTTAACAGGCCCTTCATTTCAAGACCAAACAGTAAACTTTCGGAGCTAATAGATAATATAATACCGTCAATTAAGTATGACCTTGAAACTGATATTAAAAAACAAACTACTCCGAATACATTGCAGCAAGATACATTTGGTATACAAAGCCAAGAGATTGGTGCTGATATCACTTACATAAACAATCATCCCGAAACAAGAAAGCCATTCGATAAATCATACCAGGGCCCCTCATCATACAATGCACCTCAAGGTACAGTTGGCAATTTGGAATTTAATAAAAACTCAATAGAACTCAATGACGATATAGAAAAGATCGATAATTATGAAATCAACGGTTATCCGAAACTCATTCCTACAAAACGATTTTCATTTAGATAA

Protein sequence:

>DPOGS210451-PA
TFAILRSRAQNEKSRRPFDNGQDFEVKLAVPGKPGNDYPVFHRIPRTSFTCSGKEPGYYADMETNCQVFRVCTFGSTFGYQSFLCPNGTLFNQAVYVCDWWMNVNCTKSKEFYNYKSELLNLRNGPHFMRDIKKMITHPMRNPYDQSFAKDRLIIIQNYQPPNSPFNSLSKNYAPKNSSEIISQVKKNSTTDVLTATSKPTTYYYSNFNIFPTSNKSFSPVRQQSQKSFPQQKVGTNYKQRQFSNSHSGQALPNKTKNVLSYTKRLQNIVKPTQSNQRVQQFLNHHHIPAPSSQENRLVINGQQSYPTQNTLSNNHSKVQLNRHILQDKELYSYKHNQARFSKDTRSQVNIASQQNLVSNDSSPTLIRKTLAFREIIKDPKNGTPRSKITFKTWILRPSNSEKLSADPTPYTYNTPKVTIIDDNDSDVISNTSEESEDHTDLDLEPYQYNPPTSSSALTYETTTEIPLTTKTSFSPSEPTKPSFLYLPPTTYKTPILLYNYPTNNIQTXRQYLLPEIIQSPLIELSSSPVPFIQSHTNVKEPILPLRKKNSTQINHNHNLFTNTLLKNNLEIVKDLLKDTNKLFRIISPNNIYGLKQEIKTIDYLDENLPQNIQDQVHDKTLQQSPTELPKNSKLIATPSIVLEPPDESYEYSFKNSNKLSYLSFIKQPIIPTIERTVSIKITMPQKIADFIFKKNVSDNLEILSTENTNSFVLANKMPNKEDSHQYVPIGKLVWNNSSDTSPSQELLFSFLADSISAAQEHKNIAKQENFQPTPTHFTYLNKNGIGSISDKISQMTSEQFSNIKLSSNDQLSRRANLNNLNNDLQTRGYKQIEIDNSVNRHQNKTHQDLINAKQIAGSHLSDTDSNLQSNVEPIYSGQLYQLSVPEVTKQFYNFLSQKKSKYSVNYEGKNNENKYKPSQSEFEMIKSQILSPESNPSKLNQKSIDQRTTAYDFKDNIIIPSDSIAAQIHDNTIGIIPHPLQKDKLINYKKDNIYYIYTNLNDTDINDFKRNNILNRPFISRPNSKLSELIDNIIPSIKYDLETDIKKQTTPNTLQQDTFGIQSQEIGADITYINNHPETRKPFDKSYQGPSSYNAPQGTVGNLEFNKNSIELNDDIEKIDNYEINGYPKLIPTKRFSFR-
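
The transcript and protein records above can be cross-checked against each other: 3423 bp is 1141 codons, that is, 1140 residues plus one stop codon. The following is a minimal Python sketch of such a check, assuming the standard genetic code and reading frame 1. The function name `translate` is hypothetical, and the output conventions (`X` for a codon containing an ambiguous base, `-` for a stop codon) are inferred from the protein sequence shown in this record rather than from any documented pipeline.

```python
# Minimal sketch: translate a transcript in frame 1 with the standard genetic code.
# Assumptions (not stated in the record): standard code, frame 1,
# 'X' for codons with non-ACGT bases, '-' for stop codons.

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate complete codons of a DNA string, starting at position 0."""
    seq = "".join(nucleotides.split()).upper()  # drop line breaks and spaces
    residues = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[pos:pos + 3]
        aa = CODON_TABLE.get(codon, "X")  # unknown or ambiguous codon -> X
        residues.append("-" if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First eight codons of DPOGS210451-TA as listed above.
    print(translate("ACTTTTGCAATACTCCGGAGCAGA"))  # TFAILRSR
    # Last three codons of DPOGS210451-TA as listed above.
    print(translate("TTTAGATAA"))  # FR-
```

Pasting the full DPOGS210451-TA sequence into `translate` and comparing the result with DPOGS210451-PA is one way to confirm that the two records agree, including the single `X` that corresponds to the `N` in the nucleotide sequence.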