Monarch geneset OGS2.0

DPOGS209880
Transcript: DPOGS209880-TA  1584 bp
Protein: DPOGS209880-PA  527 aa
Genomic position: DPSCF300413 + 70172-71875
RNAseq coverage: 12x (Rank: top 83%)
Annotation
Heliconius: HMEL004746  9e-19  27.00%
Bombyx: BGIBMGA009435-TA  2e-15  35.79%
Drosophila: (no hit)
EBI UniRef50: UniRef50_B7S8P8  8e-80  38.44%  Retroelement polyprotein n=15 Tax=Endopterygota RepID=B7S8P8_9HYME
NCBI RefSeq: XP_001810526.1  2e-35  28.23%  PREDICTED: similar to orf [Tribolium castaneum]
NCBI nr blastp: gi|190702380  3e-79  38.44%  retroelement polyprotein [Glyptapanteles flavicoxis]
NCBI nr blastx: gi|190702380  1e-77  38.44%  retroelement polyprotein [Glyptapanteles flavicoxis]
Group
Gene Ontology: GO:0003676  1.8e-31  nucleic acid binding
    GO:0015074  8.5e-18  DNA integration
    GO:0003677  8.5e-18  DNA binding
KEGG pathway: dre:100151502  7e-14
    K04228 (AVPR2) maps-> Neuroactive ligand-receptor interaction
    Vasopressin-regulated water reabsorption
InterPro domain: [126-288]  IPR012337  1.8e-31  Ribonuclease H-like
    [130-244]  IPR001584  8.5e-18  Integrase, catalytic core
Orthology group: MCL23319  Specific divergent
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209880-TA
ATGATCAGCAAAGACATGGAATGGTTGCAAATAGAACAAAGACGTGATGATTTACTTCGTCCTTTAATAGATAGCATGAGCAGCGACCACCCAGCCGCAAACTATACACTAAGAGAAGGTGTCCTAAAGAAAATTTTAACAGATCCTGTGCTTGGCCAACGTGAAGTTATTGTTGTTCCTAAATCATTTCAATGGAGTTTAATAAATTCATTTCACACCGCATTACATCATCCAGGATGGGAGAAAACTCTCCAAAAAATAAAAGAGACTTATGTTTTTGATCAAATGAGTAGTTTAATCAGAAGATTTGTTGAAAATTGTATAATTTGCAGAACATCAAAAGGCTCGTCAGGTAGCATACAAGTACGGCTTCATCCGATCCAAAAACCCACGGCAGCATTTCAAGTAGTACATATGGACATTACCGGGAAACTAGGAACTAGGAGCAGTGAAGGATGTGAAGAATACGTTATCGTAATAATAGATTCTTTTACTAAATATGTGCTTTTAAACTATTCAAATAATAAGAGTCCTTGCAGCAGCCTTGCAGCCTTCAAACGATTAGTACATCTTTTTGGAACACCAGTTCAGATCATGGTTGATGGAGGTCGAGAGTTTCTTGGTGAATTTAAAGTTTATTGTGATCGCTTTGGGATCAACATACATTCAATCGCACCAGGAATAAGCCGAGCTAATGGACAGGTTGAAAGAATCATGAGCACTCTAAAAAATGCCCTGACCATAATTAAAAACTACACTACCGAAAACTGGCAGACGGCTCTTGAAGCATTGCAACTCTCATTTAATTGCACGCCTCATAGAGTAACTGGAGTCGCACCTCTAACTCTTCTCACTCGTCGGCAGCATTGTGTGCCACCGGAACTGTTAAGGTTAATTGATTTTGAAAATGAATTTATAGATTTTGATGTGTTGGAGAAATATGTGCAACAGAAAATGTTGGCTTCGGCAGAATATGATAAACAGAGATTTGAAAAAAGTAAGGGTAAGCTACGTTCATTTAAAAAGGGTGATTATGTACTAATCAAAACTAATCCTCGTAATCAAACTTCTTTGGATCTGAAATATACTGAACCATACGAAATATACAAAATCTTGGAAAATGATCGTTACATGGTAAAACGTGTAACCGGTAGAGGCCGGCCGCGTAAGTTAGCTCATGATCAATTACGTCCAGCTCCAAATCCAGCAGCAACAGGAACCGTGTCGGCGGAGATTGATGATTCTCCACATCGCGACAATCCATTAAATGGTGAAGCTTCTGAAGATTTAGAAGTTGAATCCAATGAACGATCAATAAACTTGATCAAACATTTATTGCTTAGCAACAGCCTCTGTGGACCACGGACAAAAGTCCTAGACCACAGCATCGACCTACCTCAATCCCTTAGAGAACCTGTGGACCACGGCGTACGAGACCGCCCAGACCACAGCAACATCATAAAATCAATATCCTCTGTGGACCACGGACAAAAGTCCCAGACCACAGCATCGACCTACCTCAATCCCTTAGAGAACCTGTGGACCACGGCGTACGAGACCGCCCAGACCACAGCAACATCATAA

Protein sequence:

>DPOGS209880-PA
MISKDMEWLQIEQRRDDLLRPLIDSMSSDHPAANYTLREGVLKKILTDPVLGQREVIVVPKSFQWSLINSFHTALHHPGWEKTLQKIKETYVFDQMSSLIRRFVENCIICRTSKGSSGSIQVRLHPIQKPTAAFQVVHMDITGKLGTRSSEGCEEYVIVIIDSFTKYVLLNYSNNKSPCSSLAAFKRLVHLFGTPVQIMVDGGREFLGEFKVYCDRFGINIHSIAPGISRANGQVERIMSTLKNALTIIKNYTTENWQTALEALQLSFNCTPHRVTGVAPLTLLTRRQHCVPPELLRLIDFENEFIDFDVLEKYVQQKMLASAEYDKQRFEKSKGKLRSFKKGDYVLIKTNPRNQTSLDLKYTEPYEIYKILENDRYMVKRVTGRGRPRKLAHDQLRPAPNPAATGTVSAEIDDSPHRDNPLNGEASEDLEVESNERSINLIKHLLLSNSLCGPRTKVLDHSIDLPQSLREPVDHGVRDRPDHSNIIKSISSVDHGQKSQTTASTYLNPLENLWTTAYETAQTTATS-
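
The lengths in this record can be cross-checked with a few lines of Python. The sketch below is illustrative only and assumes the standard genetic code (NCBI translation table 1). The helper name translate and the variable names are hypothetical and not part of the geneset. It checks that 527 residues plus one stop codon account for the 1584 bp transcript, computes the inclusive genomic span, and translates the first 24 and last 21 nucleotides of DPOGS209880-TA. The record writes the stop as "-", while the sketch uses "*".

```python
# Standard genetic code (assumed: NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate a nucleotide string in frame 1; '*' marks a stop codon."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


# Values copied from the record.
transcript_bp = 1584
protein_aa = 527
genomic_start, genomic_end = 70172, 71875

# 527 residues plus one stop codon should account for the whole transcript.
cds_is_consistent = (protein_aa + 1) * 3 == transcript_bp

# Genomic coordinates are treated as inclusive; the record does not say
# what the extra genomic bases correspond to.
genomic_span = genomic_end - genomic_start + 1
extra_genomic_bp = genomic_span - transcript_bp

first_24_nt = "ATGATCAGCAAAGACATGGAATGG"  # start of DPOGS209880-TA
last_21_nt = "CAGACCACAGCAACATCATAA"      # end of DPOGS209880-TA

print(cds_is_consistent)               # True
print(genomic_span, extra_genomic_bp)  # 1704 120
print(translate(first_24_nt))          # MISKDMEW
print(translate(last_21_nt))           # QTTATS*
```

The two translated fragments match the start (MISKDMEW) and end (QTTATS-) of DPOGS209880-PA as listed above.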