Monarch geneset OGS2.0

DPOGS202120
Transcript: DPOGS202120-TA, 1074 bp
Protein: DPOGS202120-PA, 357 aa
Genomic position: DPSCF300150 + 325576-326724
RNAseq coverage: 73x (Rank: top 66%)
Annotation
Heliconius: HMEL012953, 2e-16, 26.29%
Bombyx: BGIBMGA006963-TA, 6e-157, 73.76%
Drosophila: CG43088-PA, 9e-10, 25.54%
EBI UniRef50: UniRef50_D6WE71, 1e-31, 35.94%, Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WE71_TRICA
NCBI RefSeq: XP_001947014.1, 5e-30, 28.75%, PREDICTED: similar to Os05g0593000 [Acyrthosiphon pisum]
NCBI nr blastp: gi|322800660, 7e-46, 63.97%, hypothetical protein SINV_10073 [Solenopsis invicta]
NCBI nr blastx: gi|322800660, 7e-45, 63.97%, hypothetical protein SINV_10073 [Solenopsis invicta]
Group
Gene Ontology: GO:0016788, 7.8e-11, hydrolase activity, acting on ester bonds
KEGG pathway:
InterPro domain: [180-350] IPR006912, 7.8e-11, Putative harbinger transposase-derived nuclease
Orthology group: MCL22143, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202120-TA
ATGGACAAAATAAAACTTACATATTTTTTAACCGAATATATACATAGTAGTGATGATAGTGATTCTAGTAGTTGGAGTGATCTAAGTAGCGTGAAGTCTGAGTTTTACGAGGACGATGAGGAAGATAGACTGTTTATCCCTCTTATGCAGTATCTTATAAGAATAAAAAGAAAACGAGTTGACGATTATTTACATTTTGTAGAATCGTGGACAGATGCAGAATTTAAAAATCGTCTTATATTATCACGCAAAACTGCTTATAAACTTATTGATGACTTAGAAAAATCAGGATTTATAGCTTCACACAAGTTTGGTTTAAAACCTTTGGAACCAAAGTTATGTTTCTATATATTCTTATCCTTCATAGCTGACACGGAACCATTGACACCATTAGCAAATAGATTTGATATATCTATATCATCAACATTCAGAGTGCTAAGGAGAGTAGTTGCTTGGTTATTAACAAAATTAGATGATGCTATAAAATGGCCACAAGATTTTAACGATGTAGAGACAATATGTGAGCAATACCACTTCAAGACGGGAATATCTAATATATTGGGTGTCATTGACTGCACTCATATAAAAATAGAGAAACCAAGAAATGCAAGAGAGTATTGTAATCCAAAAGGATACTTTTCCATAGTCTTACAAGCCACCATAGATGCTAATCTGCGCTTCACTAATATCTATTGTGGTGAACCAGGGTCTTCCAACTGTTCTAGAGTATTAAGGAAATCCCCATTGTATCAAACAGCCACTCAGAATAGAGATACATTGTTTCCCCATAATACATTTTTAGTGGGACATAGTGGATATCCCTCATTATCTTGGTTGGTTCCACCATTTAGAGAAAATAAAAGACTAACATCACAACAACGGGAATTTAATTCCCTTCATGCATCAGCTAGAAAATTAAGTGATAAAGCTTTTAACTTATTAAAAACAAAATTTAGAAGAATCAAACTATTTACAGTTTATAGGAATATACCATTTATTACTGATACTATTGTTGCTGCATGTATTTTGCATAATTACTGTCTAGATGAAAGCTGTGATCCCTCGGAGGAATAA

Protein sequence:

>DPOGS202120-PA
MDKIKLTYFLTEYIHSSDDSDSSSWSDLSSVKSEFYEDDEEDRLFIPLMQYLIRIKRKRVDDYLHFVESWTDAEFKNRLILSRKTAYKLIDDLEKSGFIASHKFGLKPLEPKLCFYIFLSFIADTEPLTPLANRFDISISSTFRVLRRVVAWLLTKLDDAIKWPQDFNDVETICEQYHFKTGISNILGVIDCTHIKIEKPRNAREYCNPKGYFSIVLQATIDANLRFTNIYCGEPGSSNCSRVLRKSPLYQTATQNRDTLFPHNTFLVGHSGYPSLSWLVPPFRENKRLTSQQREFNSLHASARKLSDKAFNLLKTKFRRIKLFTVYRNIPFITDTIVAACILHNYCLDESCDPSEE-
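
The lengths and coordinates listed in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the OGS2.0 annotation. It assumes the transcript entry contains only coding sequence ending in a single stop codon, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
def cds_matches_protein(cds_len_bp: int, protein_len_aa: int) -> bool:
    """Return True if the CDS is a whole number of codons and encodes
    protein_len_aa residues plus one terminal stop codon (assumption)."""
    return cds_len_bp % 3 == 0 and cds_len_bp // 3 - 1 == protein_len_aa


def genomic_span(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record for DPOGS202120.
transcript_bp = 1074
protein_aa = 357
span_bp = genomic_span(325576, 326724)

print(cds_matches_protein(transcript_bp, protein_aa))  # True
print(span_bp)                                         # 1149
print(span_bp - transcript_bp)                         # 75
```

Under these assumptions the transcript and protein lengths agree (357 codons plus a stop codon give 1074 bp), and the genomic span is 75 bp longer than the transcript. That difference would be consistent with intronic sequence, but the record does not state the exon structure.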