Monarch geneset OGS2.0

DPOGS209832
Transcript: DPOGS209832-TA (1680 bp)
Protein: DPOGS209832-PA (559 aa)
Genomic position: DPSCF300117 + 667905-672015
RNAseq coverage: 32x (Rank: top 75%)
Annotation
Heliconius: HMEL004346; 2e-29; 53.28%
Bombyx: BGIBMGA008054-TA; 2e-22; 31.17%
Drosophila: CG8483-PA; 2e-10; 27.37%
EBI UniRef50: UniRef50_UPI0002247D72; 9e-18; 29.35%; UPI0002247D72 related cluster n=4 Tax=unknown RepID=UPI0002247D72
NCBI RefSeq: XP_001603551.1; 4e-16; 29.44%; PREDICTED: similar to sol i 3 antigen [Nasonia vitripennis]
NCBI nr blastp: gi|345498015; 2e-17; 29.35%; PREDICTED: venom allergen 3-like isoform 2 [Nasonia vitripennis]
NCBI nr blastx: gi|198443145; 1e-16; 31.09%; Chain B, Crystal Structure Of The Major Allergen From Fire Ant Venom, Sol I 3
Group
KEGG pathway 
InterPro domain: [19-197] IPR014044; 9e-26; CAP domain
InterPro domain: [29-193] IPR001283; 1.3e-18; Allergen V5/Tpx-1-related
Orthology group: MCL25417; Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209832-TA
ATGAGATGGATCCTATTTATATTGATGGCTCTATTAATATCACAGACTGCTGGGCCGGGTCCAAATTGTAGAGGTTATATGGAAGCTCGTCTGTCATCGGAGGATAAAACGCGTATTATAGAGCGCTTGAACCGACGACGCAGCGAAACAGCGGTGGGACTGAGCAAGCTACCACCAGCCGGGGATATGTTGAAACTGCGTTGGGTGGAAGAGTTGGCTCGCGAAGCTGAAAGATGGGCGAACCAATGTCGGCCTCCAAACTCACCAGAAGAACAAGACTTGTGTCGAGATTTGTATTCCATAACGGTTGGTCAATGCGTCGCTTCCGTTGTGGGCGAAGCGCCAGGCCTTCGCCCTGAAACGATGGTCGATATGTGGTTCATGCAGAACAAATACTACAGAGGAAATGTCACTTCATATTTGCCATCGGAGGACGGCGCTAATTCCTATGACGATTTTGCCCAAATGATTTGGTCTCGGACTTATATGGTGGGCTGTGGACGAAGCAGATTTATGACGGATTGGCGGGGACGTCTCCGGACTGTAGAGCGCTTGGTATGCAACTTTGCGCCTCGGGGCCCTGCAATCTATAGACCGCTGTGGAGCCCTTCAGACCCGGCCACCGCGTGCCCTCCTCGCGCTAGACCTGACCCAGCTTTACCAGCACTGTGTATTTATCAAAAAAATATGAATGAATTTAATGATGAGAATAGTGTACTCTCACTCGAAGATAACTTATTATTGGACTCATTGAATGATATCGAGAGAAATAAATCGCTAGATTATATCGGAAGCTTGGATGAAATTTATTTGACGAAACTCGCTATAATGACCATGACAAATAATGATTCTCCAGTATTGTCTCTTCATTCAGTAGAAAAAAGACATCATAATATGTTAGAATATAACAATAGTTCCGTTCCAAGTAAGGGATTGGATAACAATACATCAAGTATACGGATAGTAAAGAAAAAGGTGTATTTCGTTGGCCGTCCGAAGACTTACAAGGTGGAAGACTTAAATGATTTGAATGATGTAATAAATGAGAATAGAATAGAAGTAACTACCAGAGATATATATGACTACTATGAGTATAAGGAATTAGATGATTTGATAGAAACAACTCAGAGTACGATGAAAACATATGGAAATACAGTTGAAGAAACGAAAATATATGAGAGAGACGTTCTTAATGCAACTGATATTAGTGTATCTGCTAGTCCAATAACAAGCACATTGGAATCAATAAGCAATAGTGAAAATATAAATGTTCAAATTCATCGAAACAAAACTGGACTCGAGTCTGCGGAGCAAGTTGTCGAAGATAATTTTATAGACGATTATCTCACAGATGCTGAAACGGCTCGTCAGCTGCAAGAAGCCTTGGAGCGTATGGAAAGCAAATTAGCTACACCTTCATCTACACCTGGAAAGGTTCGCAGAGAGTTACGTAATTCAAATGAAAGAGACGAGGATTATAGAGTGGAAACGGAATCTCCTGTACACGTCGAGAAGAATAAGACAATAGATAGAGGTCCGATGCTCAGCATGGTGTTGAAATATATGCCGTACTTGAAAACGTATGAAAAGACCATTCTAGGGGATCCCAGCGCCAGCCGCGCCTCTCTACTAACGCCCTATTTAACATTACATTTTATTACGCATCTATTATTTTATTAG

Protein sequence:

>DPOGS209832-PA
MRWILFILMALLISQTAGPGPNCRGYMEARLSSEDKTRIIERLNRRRSETAVGLSKLPPAGDMLKLRWVEELAREAERWANQCRPPNSPEEQDLCRDLYSITVGQCVASVVGEAPGLRPETMVDMWFMQNKYYRGNVTSYLPSEDGANSYDDFAQMIWSRTYMVGCGRSRFMTDWRGRLRTVERLVCNFAPRGPAIYRPLWSPSDPATACPPRARPDPALPALCIYQKNMNEFNDENSVLSLEDNLLLDSLNDIERNKSLDYIGSLDEIYLTKLAIMTMTNNDSPVLSLHSVEKRHHNMLEYNNSSVPSKGLDNNTSSIRIVKKKVYFVGRPKTYKVEDLNDLNDVINENRIEVTTRDIYDYYEYKELDDLIETTQSTMKTYGNTVEETKIYERDVLNATDISVSASPITSTLESISNSENINVQIHRNKTGLESAEQVVEDNFIDDYLTDAETARQLQEALERMESKLATPSSTPGKVRRELRNSNERDEDYRVETESPVHVEKNKTIDRGPMLSMVLKYMPYLKTYEKTILGDPSASRASLLTPYLTLHFITHLLFY-
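
The transcript and protein records above should agree with each other: 1680 bp is 560 codons, that is 559 residues plus one stop codon, which the protein record writes as a trailing "-". A minimal Python sketch of that check follows. It assumes the standard genetic code (NCBI translation table 1). The names CODON_TABLE and translate are illustrative and not part of the database, and only the first and last codons of DPOGS209832-TA are translated here to keep the example short.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1), bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` ("-" matches the record above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 18 nucleotides of DPOGS209832-TA, copied from the record.
head = translate("ATGAGATGGATCCTATTTATATTGATGGCT")
tail = translate("CATCTATTATTTTATTAG")

# Length check: 1680 bp -> 560 codons -> 559 residues plus the stop codon.
expected_residues = 1680 // 3 - 1
```

Passing the full nucleotide sequence to translate and comparing the result with the DPOGS209832-PA entry extends the same check to every residue.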