Monarch geneset OGS2.0

DPOGS200554
Transcript: DPOGS200554-TA, 1956 bp
Protein: DPOGS200554-PA, 651 aa
Genomic position: DPSCF300119 + 122366-125920
RNAseq coverage: 31x (Rank: top 75%)
Annotation
Heliconius       HMEL016864        4e-162  48.65%
Bombyx           BGIBMGA010770-TA  1e-114  37.84%
Drosophila       CG14309-PA        6e-25   25.21%
EBI UniRef50     UniRef50_D6X2K3   5e-26   25.00%  Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6X2K3_TRICA
NCBI RefSeq      XP_001809953.1    1e-26   25.00%  PREDICTED: hypothetical protein [Tribolium castaneum]
NCBI nr blastp   gi|270013008      2e-25   25.00%  hypothetical protein TcasGA2_TC010672 [Tribolium castaneum]
NCBI nr blastx   gi|270013008      1e-26   21.79%  hypothetical protein TcasGA2_TC010672 [Tribolium castaneum]
Group
KEGG pathway: dre:562573, 8e-09
 K07965 (HPSE2) maps-> Glycosaminoglycan degradation
Orthology group: MCL19589 Insect specific

Nucleotide sequence:

>DPOGS200554-TA
ATGAGACAATCGGATTGGAGGAGTTTTATAAAGTGGGCAAAAAACTCGGGCTTCGACCTGGTGCTCGCATTAAACAATCATCACAGGACTGGAGTCATGTGGGATGCCAACATCGCCCTCGACATGCTCACGGCAGCACAAAAACAACAAGTTGGGGAAATGTTCTGGCAACTGGGTTACGAATGTAGAAATCAAACTATTGAAGAATATCTTAACGATTTGGAAACTCTCCGAGTTATTGTGGAGACTTTCCCTTCTGGGATGTCTAGGAAGTGGAAAGTAGTGGGAGCTGACGTCAGCAAATGCCTAAATGGAAACTCCAAGAACGACTTTAAGGATTACGTCATCACGTCCAACGATATGATGGACGCCATATTTTTGGATGGGAATTCAACATCGCGGGAGCTGTCAGCTATGTCCCCTCGCGAATACTCCAAGCTACTTCAGCGCTTGAGCCGCAGCGACACTCCGCTGTGGGGGTCATCTAAGGCCTCGTCTCCGAGAGACCGGCTGTCATCTCTAGGGACAGCAGCAGCTAGCGGGTTCACTGTACATTTCCAGGAGTTGATGGAGGACGAGCTCTGCGAACCCTCATTGAACCTCTATACGTTCCTGTTATTTAAGCACCTGGTCGGTACCCGTGTCCTCTCAGTCTCTGCCCCAGTTCCCTCCCCCCTCGTATCTCCTCCTGGTCTCTCCCTATTTTGTCACTGTTCCTCACTCCGCGGTCGGCCGGTTCCAGGCGCCATCACCGTCTACGGGGTCAACGACCAACAACAACACGCAGCCTTCACTCTCAATATCACACAACATGATGGAGACATCCTGCAGTTCATACTGGAACATGATATGACAGGGAGTATTATTGTGAACGGTCGCCCAGCAACTCGCGATGGTCACATAAGACCAGTCATCAAACTGGGTCGCTCATATAAACCTCTTGTATTCACTCTGCCTCCTAAATCCTTGGGTTTTTGGGTGCTCGCTAACGCACAAATAACTGCATGTTACAATAAAACAACTAGACTTGATTATAATGAAATTAATAGACGCTTCGATGACGAAGATAATTTTATTAAAACTAAAAGATCTATAAAAGAAAATGAGGATTTCATTTCTCAAGAATCCGCCGAAAAATCAGATGGATTTAGTTCAGTGCAGAATAATATTGCTTTGAGAAAACGAATTGAGGACATTAACAGTGAACTCCGGAAAATATTTCAAAGCTTTGATAAAAAAAAATACAATGCTAACCGAGTGAGACGAGAAATGTGTGACGATGAAAATAATACACGTAAATCTAGAGCATTGAGTAGATTAAAATCACGGAGAAGTCACGAGAAGGCACGAGGATATAATGGTCTTGCAAAAATTTCAAAGTTTTCCAAAGATAAAGTGGGACGTATAAAGAATAAATTATCCAGACTACGGGACACAAAAAGAAATTCAGCTTCGAGGTCAACAAGAAAAACGGAAAATAAGTATATCACGCATAGAAAGAGCGACGGCGAACCGAAAACTAAAGAAAGTTCTTTAAAGAATGAAATTTTAGATAACACGAAAAGTTCTGATGAAAAGAAAACGAGGAATCGTAGAAGTTTATCCAAAAATAAATCAAACAGACACCTCGACGAAGAAGAAGACTCGTCGGAGAATGAAATAGAGAGTAGCAAAGAGAAAATTAAAATTGGTAAATTATTTAACAATCTTAAGAAGTTAAGTGAACTGCCGATAGAAATACAGAGTAAAGACAAGTTGGAAGATTATAAAGACGACGGTTCTGAAGAAGGCATCGTTTTAAAAACGAAGCTATCAGATGATAGTGCGACCATTGACATCACGGACAAAACCAAGTCAGGTCTCTTGAAATCCGCGTTACAAGATATACTCTCCCTCTTCGCCGATTTCAATAAAAACATTAACAGGCTATGGACCGCCATCACAATACTTGAGTAA

Protein sequence:

>DPOGS200554-PA
MRQSDWRSFIKWAKNSGFDLVLALNNHHRTGVMWDANIALDMLTAAQKQQVGEMFWQLGYECRNQTIEEYLNDLETLRVIVETFPSGMSRKWKVVGADVSKCLNGNSKNDFKDYVITSNDMMDAIFLDGNSTSRELSAMSPREYSKLLQRLSRSDTPLWGSSKASSPRDRLSSLGTAAASGFTVHFQELMEDELCEPSLNLYTFLLFKHLVGTRVLSVSAPVPSPLVSPPGLSLFCHCSSLRGRPVPGAITVYGVNDQQQHAAFTLNITQHDGDILQFILEHDMTGSIIVNGRPATRDGHIRPVIKLGRSYKPLVFTLPPKSLGFWVLANAQITACYNKTTRLDYNEINRRFDDEDNFIKTKRSIKENEDFISQESAEKSDGFSSVQNNIALRKRIEDINSELRKIFQSFDKKKYNANRVRREMCDDENNTRKSRALSRLKSRRSHEKARGYNGLAKISKFSKDKVGRIKNKLSRLRDTKRNSASRSTRKTENKYITHRKSDGEPKTKESSLKNEILDNTKSSDEKKTRNRRSLSKNKSNRHLDEEEDSSENEIESSKEKIKIGKLFNNLKKLSELPIEIQSKDKLEDYKDDGSEEGIVLKTKLSDDSATIDITDKTKSGLLKSALQDILSLFADFNKNINRLWTAITILE-