Monarch geneset OGS2.0

DPOGS215502
Transcript: DPOGS215502-TA, 2322 bp
Protein: DPOGS215502-PA, 773 aa
Genomic position: DPSCF300518 + 25415-29676
RNAseq coverage: 14x (Rank: top 82%)
Annotation
Heliconius: HMEL009626, E-value 1e-06, identity 26.75%
Bombyx: BGIBMGA011035-TA, E-value 5e-81, identity 36.87%
Drosophila: no hit listed
EBI UniRef50: UniRef50_E5SB91, E-value 1e-128, identity 34.69%, Retrovirus-related Pol polyprotein from transposon TNT 1-94 n=4 Tax=Bilateria RepID=E5SB91_TRISP
NCBI RefSeq: XP_001601956.1, E-value 1e-106, identity 32.38%, PREDICTED: hypothetical protein [Nasonia vitripennis]
NCBI nr blastp: gi|339241765, E-value 4e-128, identity 34.69%, retrovirus-related Pol polyprotein from transposon TNT 1-94 [Trichinella spiralis]
NCBI nr blastx: gi|339241765, E-value 8e-123, identity 34.69%, retrovirus-related Pol polyprotein from transposon TNT 1-94 [Trichinella spiralis]
Group
Gene Ontology:
  GO:0003676, E-value 1.2e-42, nucleic acid binding
  GO:0015074, E-value 1e-27, DNA integration
  GO:0003677, E-value 1e-27, DNA binding
  GO:0008270, E-value 2.9e-06, zinc ion binding
KEGG pathway: cqu:CpipJ_CPIJ000505, E-value 2e-07
  K01951 (E6.3.5.2, guaA) maps to: Purine metabolism; Drug metabolism - other enzymes
InterPro domain:
  [461-639] IPR012337, E-value 1.2e-42, Ribonuclease H-like
  [466-582] IPR001584, E-value 1e-27, Integrase, catalytic core
  [369-370] IPR013084, E-value 2.9e-06, Zinc finger, CCHC retroviral-type
Orthology group: MCL10015, Insect specific

Nucleotide sequence:

>DPOGS215502-TA
ATGTCGTCGAACTATTTAGCTAACGTGCCAAAGTTAAAAGGACGCGAGAACTATGACGATTGGTGTTTTGCGGCGCAAAATGTGTTGGTATTAGAAGGCATGGCAGCGTGCATACAACAAGAGCTATCGGCAGAGGCTTCAGCAGCAGATTTAGCTAACGATGCGAAGGCTAAAGCGAAGCTTGTATTGACCATTGATCCATCACTTTACGTGCACATCAAGCAGGCAGTATCAGCGTACGAATTATGGAATACTTTGAGAACAATGTTTGATGATTCCGGCTACACAAGGAAGATAAGTTTACTACGAAATCTCATTAATATAAGGCTGGAAAACTGCGAGACAATGACACAGTATGTAACTCAGATCGTCGAAACGGGTCAGCGGCTTCAAGGCACAGGTTTCAAAATAACCGATGAGTGGATTGGTGCTCTGATGTTAGCAGGATTACCCGAGAAATATGCGCCGATGATCATGGCCATAGAACACTCCGGTATTGAGGTTAGTGCAGACGTCATAAAAACGAAATTGATGGACATGTGTTTGGAGGTTGGTACTACTTCCGGCTCCGAGAGTGCATTTATTGCGAAAGGAAGGCATCGAGTGAGAAATGGCAACCCTAGAAGCGGCCATAGTCAAAATCATTCACAAACTCGCTCAGAGATGTCATCAGTGTCAAAGAAGATAAAATGTTTTCGGTGTAAACAAAGCGGACATTATAAAAATCAGTGCCCGCAGGCCGAAACAGTAAAAAGAAAACAGACAAATGCGTTTAATGCAATTTTTCTTACCGGAAAATTCAATAGAGATAGCTGGTACATTGATAGCGGAGCAAGTAAACACATGACAGCATGTAAACATAACCTCGTAAATGTGTCGGAATTACCCAAAACCAAAGAAATTATTATCGCAAACCAGACGAAAGTTCCCGTGCAGTGTTCTGGCGACGTAAACATCGTTACAGTTGTGAACGAAGTGGAATATGACGTCGTCGTGAGAGATATTTTGTATATACCAGAGTTGACCACCAACTTGCTATCTGTGAGTCAGTTGATTGAACATGGCAACAGTGTAATTTTCAAGGAAAATGTTTGCTACATATACAATCAACAGAAGGAACTCGTTGGCCAAGCAGAATTAGTTGATGGGGTGTATAAGTTAATAACATCACAGTCACAGACAGAACAAACATTGGCAGCCACAGCAGTAGCATCAAGCGATATATGGCATCGACGCTTAGGACATATAAATAGTGACAGTCTAAACAAAATGAAAAATGGGGCTGTAGAAGGTATATCATTCCCAGAAAAAGCAAACATAAGTAAATCTTCATGTATTATTTGTTGTGAAGGCAAGCAAGCCAGACTACCGTTTCAGCATAGTACGTCAAAGACTGAGGGTGTACTAGAGGTAATTCACGCAGATGTTGGAGGTCCTATGGAGAAGCCATCCATCGGACAATCCAGATATTATGTTTTGTTTGTGGATGATTACAGCAAAATGTCATTTGTGTATTTTATGAAAGAGAAAAGTGAAGTACTCAAATATTTTAAGGAGTTCCAGACTATGACAGAAAAACAGAAAGGTAAAAAGATCAAAACTCTAAGGACAGATAACGGAGGAGAATTCTGTAGTCTAGAATTTGAGAAGTATTTGAAGGAGAGAGGAATTGTTCATCAGAAAACAAATCCGTACACGCCCGAACAGAATGGTATGTGTGAAAGATTGAATAGGTCAGTAGTAGAAAAGGCCAGGTGTTTAATCTTTGATACCAATTTAGATAAAAAGTTTTGGGCAGAAGCCGTTAATACTTCAGTTTATATAAGGAATCGATCTGTTGTCAAGGGACTAAATAACGAAACTCCATATCAAGTATGGACAGGCCAGAAACCCGATATTAGCCATCTACGTATATTTGGTAGCAAAGTAATGGTGCACATCCCGAAACAGAGACGTTTAAAATGGGACAGAAAGTCTAAACAGCTGATTCTAGTGGGATA
TGCAGAAAATGTGAAGGGATACAGGGTGTACGATCCTAGTACTAATTCTATAACAACTAGCAGAGACATTATCATCATGGAGGCTACCAATGATCCAGAAATGATTCAAATACCTATAGAATGCAAGGCTACAGCGGAGGAAAATGAGTATGGGGAAGAAAGTTACATCTCAGAGGAAGATGGAGAAGAAGATAGAAAAGATGAGACTTATGTGCCAGATATAGAAGCTTCTTCGATACCCACAGAAATAACCGCTATAAACATATTAATATTCGACATGAGTCTCTTTGTCAGGGTGAGTGTGAAGGTCTACTTGGCTTGA

Protein sequence:

>DPOGS215502-PA
MSSNYLANVPKLKGRENYDDWCFAAQNVLVLEGMAACIQQELSAEASAADLANDAKAKAKLVLTIDPSLYVHIKQAVSAYELWNTLRTMFDDSGYTRKISLLRNLINIRLENCETMTQYVTQIVETGQRLQGTGFKITDEWIGALMLAGLPEKYAPMIMAIEHSGIEVSADVIKTKLMDMCLEVGTTSGSESAFIAKGRHRVRNGNPRSGHSQNHSQTRSEMSSVSKKIKCFRCKQSGHYKNQCPQAETVKRKQTNAFNAIFLTGKFNRDSWYIDSGASKHMTACKHNLVNVSELPKTKEIIIANQTKVPVQCSGDVNIVTVVNEVEYDVVVRDILYIPELTTNLLSVSQLIEHGNSVIFKENVCYIYNQQKELVGQAELVDGVYKLITSQSQTEQTLAATAVASSDIWHRRLGHINSDSLNKMKNGAVEGISFPEKANISKSSCIICCEGKQARLPFQHSTSKTEGVLEVIHADVGGPMEKPSIGQSRYYVLFVDDYSKMSFVYFMKEKSEVLKYFKEFQTMTEKQKGKKIKTLRTDNGGEFCSLEFEKYLKERGIVHQKTNPYTPEQNGMCERLNRSVVEKARCLIFDTNLDKKFWAEAVNTSVYIRNRSVVKGLNNETPYQVWTGQKPDISHLRIFGSKVMVHIPKQRRLKWDRKSKQLILVGYAENVKGYRVYDPSTNSITTSRDIIIMEATNDPEMIQIPIECKATAEENEYGEESYISEEDGEEDRKDETYVPDIEASSIPTEITAINILIFDMSLFVRVSVKVYLA-
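
The record lists a 2322 bp transcript and a 773 aa protein, which agree if the transcript is a complete coding sequence with one stop codon: 2322 = 3 x (773 + 1). The following is a minimal sketch of how that relationship could be checked by translating the nucleotide sequence with the standard genetic code. The function names (`translate`, `lengths_consistent`) and the `stop_symbol` parameter are hypothetical, and it is an assumption that the trailing "-" in the protein sequence marks the stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest), "*" marking stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "*") -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    # Remove line breaks and spaces, since FASTA sequences may be wrapped.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly the protein's codons plus one stop."""
    return transcript_bp == 3 * (protein_aa + 1)
```

Passing the full DPOGS215502-TA sequence to `translate(..., stop_symbol="-")` and comparing the result with the DPOGS215502-PA entry would be one way to confirm that the two sequences correspond; the first five codons (ATG TCG TCG AAC TAT) and the last three (TTG GCT TGA) translate to the start and end of the listed protein.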