Monarch geneset OGS2.0

DPOGS204381
Transcript: DPOGS204381-TA, 3036 bp
Protein: DPOGS204381-PA, 1011 aa
Genomic position: DPSCF300002 - 1738917-1744276
RNAseq coverage: 103x (Rank: top 60%)
Annotation
Heliconius      HMEL013074        2e-55  57.23%
Bombyx          BGIBMGA007678-TA  2e-23  34.20%
Drosophila      obst-E-PB         4e-27  37.91%
EBI UniRef50    UniRef50_F4WJU2   1e-33  43.20%  Chondroitin proteoglycan-2 n=8 Tax=Endopterygota RepID=F4WJU2_ACREC
NCBI RefSeq     NP_001161915.1    9e-34  40.21%  cuticular protein analogous to peritrophins 3-E [Tribolium castaneum]
NCBI nr blastp  gi|332025417      4e-33  43.20%  Chondroitin proteoglycan-2 [Acromyrmex echinatior]
NCBI nr blastx  gi|332025417      3e-36  43.45%  Chondroitin proteoglycan-2 [Acromyrmex echinatior]
Group
Gene Ontology    GO:0008061  5.1e-17  chitin binding
                 GO:0006030  5.1e-17  chitin metabolic process
                 GO:0005576  5.1e-17  extracellular region
KEGG pathway     tca:662504  2e-08
                 K01873 (VARS, valS) maps-> Aminoacyl-tRNA biosynthesis
                                            Valine, leucine and isoleucine biosynthesis
InterPro domain  [920-994] IPR002557  5.1e-17  Chitin binding domain
Orthology group  MCL23316 Specific divergent
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204381-TA
ATGTATCTTAATCTTTGTTTAGCCGCTATCCTGATTGTCGTAGCATTTGCCGAAAGTGATTCCAATATTTGCAGTAATGTTATCGATTCTTACATTGGAGACCCACAAAGCTGCGACACCTACATCAGATGCCAAGCTCATCATCCAATTCATGCATCTTGTCCAGATGGATTAAATTTCAACCCTAAGGTGAAATATCCGAATTTTCCATGTAGTTATCCAGAGGACGTACCGTGCAATGGCCGAGCTTATTCAAATCCGCCTAAGCCAACAGCGGAATGTCCCAGACAGAATGGTTACTTTCCGGCGCCCGCTGCATCTAAACAGGACTGTGGACGTTATAGAGTATGTAAAGCTGGGAAAGCGATCTTCATGTCTTGTCCAACAGGACTCGCATTCAATCCAGCCACCGCCAAATGCGATTGGCCTGACCAAGTACCATCCTGCATCGCTAATGATTTCTTTGGTTTCTCCTGTCCACCCGGTACTGTAGATATTAGTGGCAACCCCATTATCACCAACCATAAACATCAAGATGACTGCTACAATTTCTTCTCCTGTGAGAACGGCCAAGCTCGTCTCCTGTCTTGCGATATCGGTTATGCCTTCGACGAGCGCTCCGGACTAAATGCTCAGGGCGGTGGAAGATTTAGTTTTCCCAGTCGATTTGATTTTCGAAGGCCCATAGTAAGACCATGGGCGAGACCTAGTGTATCTGCTGCACCAGATACTCCAAAGCCCACAGAAAGAAGTTCACCTACTGAAAAGAGTATAACCACTGAAAAACCTATAGAACCTATTCCAACAACGAAAGCTGTTTCAAGTACAATTTCTCCAATTGTCATACCCAAAACTGCTAGGCCATCCTTGACAAGAATGCCACCACGTGGACGTTTTGGGCCACAGCCTGTATTTGGGCCCAAACCTGGCTCCGTAGTGACGAATTCTCCAATCGATACATCGACAGGGGCCTTGGAAATAGTTACCCCGATGAAAATGTTTCCAGAATATTATCCAGTCCTATCAACAACGGAATCGAATATACTATCAACGAGATCACCGTTTGTTCCTAATCGTATAACAAGTAACAGTCACCAAAATTTTGATCCCAGATCTCAAAATTCTCTATTCATGTCTCATAATGCAAACGAAGATGATAAAACTGCTGAATATCAAACCGAAAAAGCATACAGTGTAGTCTCGAAAACAACATCAGAATTACCCTTATCTACTCAACAACCTTTAAAGGCCATGGAAAATTCTGTAAATAATAATCCAAATGTTTTGTCACCCTTAATAGCAACAGTATCCGAGACGGAAAAAATAGCTACAACTTCAAAAGCATCATTACTAGAAGAAACTACAAAATATGAACACCCTTCAAGTTTAAATCCAGACTGGTATACTAAATCCTACGTATCTAAACCTACAACAACAGTTCCATCGACTCCCAAAATAATTATTGTACCAGCAATTACAGTAACGTCAAATGCTCCACTTGATGAGGAGCACAACAATGAATTTTCAAATTTCAACATAAGAGATCGAGTGCCACCTTACACATCCCACGCCTTAAATGTACGGATACCTGAGAATCGACGAGATTTCCCTTCTAGTTCAAAATTTCCGTCAAAGGAAATTATAATAACAGCAGCAACTACAGAAAAACCACCATCTACAACAGATATAAAAGAATACCAAGGGAAAGAACCTTCTACAAACGAACCGTTAGTTAAAGTAGAAGAATCTCCATTTAAGATAGTATCAACAACTGAAGGCTCCCCGGCGAAGACTAATGTTTATAAATTATCAACTAACGGTATTAACAGCTACAAAGATAATTACTATGAGCAAACAACATTTGTACCTAAAATTATCTCCACATCAACAAAAGAACCAAAAGAATATGCTATACCTATAAGTACATCAGGCCCACTTATTGGCGTGCAAAAAATGGATGCTGACAGAGGCTTCACTCCAAAGAATCCTGAAATATGGGT
AATGGAAAATTACGATAAAACTAAAACCCCAACGGACTCTTATAAATCCGACACACCCATTTACCCTATATATTCAAATAATAATGAAGCAAATTACGACGAACTACAACGATCTATAACAAATGCACCAACTATTATAAATAATCCCACAACAATACCTAAAAAATATACTATACCGACAACATCTAAACCATGGATTATTGATGATAACAAGAAATACGTTTTGACCACAGAAACTCCAATTGAATATACCAGTCAGTATGAAATAACAACAGAACCAACGTATAAATTAGATACGTATTCCACTCGTTCTGTTTATCTAAACAATAATGAAGAGAATTACTTTAAAGAAAATGAGAAACCAGAGGTCACTAATAAGACCAATTTAACGCAATCATATAGGGCCTGGTATCATAAAACCGATACGCCACCAACTAAACCACCAAAGGCGACTATAATACAGCTCTCGAATAACAACCGGTACAGTAAAACGACGACTGCACCTAAGCCATCAGTTCGTTATACCACGCCCCAACCAAGGAATTCCTACGTAGAAAAAGTGTATGACATAGGTAACTTTAAATGTAAGGACGATGGATTCTATGCAATAACAAATCAATGTGACGACTTCATCGAATGCAAGTCTGGAGTCCCTATTCAAAACTCTTGCCCTGATGGACTTCATTTCAATCCGGCAGCTAAACACTCGGAATTTCCATGTTCCTACCCTTCAGAGGTTAAGTGCGAGAACCAAGCTGCCAGTCATAAGGCTCAACCAACTTCCGAATGTCCGCGTCGCTATGGCTACTTTTCTCTGCCGAGTGGTGGCTGTGACAAGTACATTATGTGTCAAGAAGGCCTGGCCACAGTGATGTCTTGTCCGCCAGGACTCGCCTTTAACATAGGCACAAGTAGTTGTGATTGGCCTTCAAATGTTCCCGACTGTGTGCCTGATGTTTTTGAAGGATTTATCTGCCCAGCGCCAGAGCTTGATGAAGACAGTAATCCTGTCCGCAGCATTTACAAATACAGGTAA

Protein sequence:

>DPOGS204381-PA
MYLNLCLAAILIVVAFAESDSNICSNVIDSYIGDPQSCDTYIRCQAHHPIHASCPDGLNFNPKVKYPNFPCSYPEDVPCNGRAYSNPPKPTAECPRQNGYFPAPAASKQDCGRYRVCKAGKAIFMSCPTGLAFNPATAKCDWPDQVPSCIANDFFGFSCPPGTVDISGNPIITNHKHQDDCYNFFSCENGQARLLSCDIGYAFDERSGLNAQGGGRFSFPSRFDFRRPIVRPWARPSVSAAPDTPKPTERSSPTEKSITTEKPIEPIPTTKAVSSTISPIVIPKTARPSLTRMPPRGRFGPQPVFGPKPGSVVTNSPIDTSTGALEIVTPMKMFPEYYPVLSTTESNILSTRSPFVPNRITSNSHQNFDPRSQNSLFMSHNANEDDKTAEYQTEKAYSVVSKTTSELPLSTQQPLKAMENSVNNNPNVLSPLIATVSETEKIATTSKASLLEETTKYEHPSSLNPDWYTKSYVSKPTTTVPSTPKIIIVPAITVTSNAPLDEEHNNEFSNFNIRDRVPPYTSHALNVRIPENRRDFPSSSKFPSKEIIITAATTEKPPSTTDIKEYQGKEPSTNEPLVKVEESPFKIVSTTEGSPAKTNVYKLSTNGINSYKDNYYEQTTFVPKIISTSTKEPKEYAIPISTSGPLIGVQKMDADRGFTPKNPEIWVMENYDKTKTPTDSYKSDTPIYPIYSNNNEANYDELQRSITNAPTIINNPTTIPKKYTIPTTSKPWIIDDNKKYVLTTETPIEYTSQYEITTEPTYKLDTYSTRSVYLNNNEENYFKENEKPEVTNKTNLTQSYRAWYHKTDTPPTKPPKATIIQLSNNNRYSKTTTAPKPSVRYTTPQPRNSYVEKVYDIGNFKCKDDGFYAITNQCDDFIECKSGVPIQNSCPDGLHFNPAAKHSEFPCSYPSEVKCENQAASHKAQPTSECPRRYGYFSLPSGGCDKYIMCQEGLATVMSCPPGLAFNIGTSSCDWPSNVPDCVPDVFEGFICPAPELDEDSNPVRSIYKYR-
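
Length check (illustrative sketch):

The reported sizes (3036 bp transcript, 1011 aa protein) are consistent with a transcript that is coding sequence only and ends in a stop codon, since 3 x (1011 + 1) = 3036. The Python sketch below shows one way to check this for any pair of FASTA records. The function names `read_fasta` and `lengths_consistent` are hypothetical helpers, not part of any database tooling, and the check assumes that a trailing `-` or `*` in the protein record marks the stop codon.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def lengths_consistent(transcript, protein):
    """Return True if the transcript length fits the protein length.

    Assumption: the transcript is coding sequence only (no UTRs) and ends
    in a stop codon; a trailing '-' or '*' in the protein marks that stop.
    """
    residues = protein.rstrip("-*")
    return len(transcript) == 3 * (len(residues) + 1)


# Sizes reported in this record: 1011 residues plus one stop codon.
assert 3 * (1011 + 1) == 3036
```

Because `read_fasta` joins wrapped lines, a sequence split across several lines, as the nucleotide record above is, is treated as a single record.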