Monarch geneset OGS2.0

DPOGS200808
Transcript: DPOGS200808-TA | 3963 bp
Protein: DPOGS200808-PA | 1161 aa
Genomic position: DPSCF300249 - 32631-36593
RNAseq coverage: 10x (Rank: top 84%)
Annotation
Heliconius: (no entry)
Bombyx: BGIBMGA011035-TA | 1e-72 | 35.43%
Drosophila: (no entry)
EBI UniRef50: UniRef50_E5SB91 | 0.0 | 34.71% | Retrovirus-related Pol polyprotein from transposon TNT 1-94 n=4 Tax=Bilateria RepID=E5SB91_TRISP
NCBI RefSeq: XP_001599944.1 | 9e-180 | 31.93% | PREDICTED: similar to copia-like retrotransposable element [Nasonia vitripennis]
NCBI nr blastp: gi|339241765 | 0.0 | 34.71% | retrovirus-related Pol polyprotein from transposon TNT 1-94 [Trichinella spiralis]
NCBI nr blastx: gi|339241765 | 0.0 | 34.71% | retrovirus-related Pol polyprotein from transposon TNT 1-94 [Trichinella spiralis]
Group: (no entry)
Gene Ontology:
  GO:0003676 | 5.1e-43 | nucleic acid binding
  GO:0015074 | 2.9e-29 | DNA integration
  GO:0003677 | 2.9e-29 | DNA binding
KEGG pathway: uma:UM00214.1 | 4e-31
  K00140 (E1.2.1.27, mmsA, iolA) maps->
    Inositol phosphate metabolism
    Propanoate metabolism
    Valine, leucine and isoleucine degradation
InterPro domain:
  [828-1068] | IPR013103 | 1.7e-87 | Reverse transcriptase, RNA-dependent DNA polymerase
  [450-624] | IPR012337 | 5.1e-43 | Ribonuclease H-like
  [455-571] | IPR001584 | 2.9e-29 | Integrase, catalytic core
Orthology group: MCL10015 | Insect specific

Nucleotide sequence:

>DPOGS200808-TA
ATGACTGCGAACTATATTGCCAGTGTCCCTAAACTAAAAGGCAGGGAAAATTATGATGAGTGGAGCTTCGCGGCTGAAAATTTATTGGTTCTTGAAGGAATGGACAATTACATCAAGCCAACGGCAGGTTTTGAGGTTAAGCCAGCAGAAGATGCAAAAACGAAGGCAAAGCTTATATTGACTGTGGACCCATCGTTATACGTGCATATTAAGAATACCAAGAGCGCAGCAGAATTATGGACTACATTAAAGACCATGTTTGACGACTCTGGTTTCTCACGCAAAATAACATTGCTACGACATTTGATCTCTATACGCCTAGATAATTGTGATTCAATGGCTACATATGTGACTCAAATGGTCGAGACAGCTCAACGGTTGAACGGCACAGGTTTTACAATCACCGATGAATGGGTGGGTTCATTATTATTAGCAGGTTTGTCAGATCGTTATTCTCCCATGATAATGGCAATTGAGCATTCGGGCATTTCGATTACTGCAGATGTCATTAAGTCGAAATTGCTCGACATGGAAGTGACCAGAGACAATGACGCCGGTGCTGCATTTGCCGCGCGAAATAATCATTCGTTCAATAAGTCAAAGAAAGGCGGTCCCGGTCCTTCAACGTCAGTTTCAAACAAGAAAATAACAGCAGCAATGACAGATTCATCGAAAACAATTACATGCTACAAATGTAAAGAAGATGGTCATTATCGAAATCAGTGCCCTTTATTGAAGAAAAACAATAGTAAATGTGTTTTTAATGTCGTTTTCCTGAATGGAAAGTTCAATAAAACTGAATGGTACGTGGATTCCGGCGCCAGCGCTCATATGACGGCGAATGAATCTTGGGTAAAAAACATGAACCGATCACCGTGTTTACCAGAAATAGTAGTCGCAAATGAATCTACAGTGCCAATAGTTTGTTCTGGAGACGTCGACATTGTTTCAGAACTGAAATATGAAATTATAGTGAAGGATGTGCTGTGCGTTCCCAGTTTGACAACAAATTTATTATCCGTGAGTGAACTGATTAAAAACGGAAATAATGTGATATTTGATGAAAAACACTGCTACATTCGAGACAAGAATGATGTTTTGATTGCTACTGCCGACCTGTCAGATGGTGTTTATAAGTTGAGGTTAGAAACACAGCATTGTATGTTGGCCGCACCAGCAGTTAGTGGGAATCTGTGGCACAGAAGATTAGCACACCTAAATAGTCAAGATATGAAGAAGATGAGAAATATTGTGGATGGTTTGTCATATGAACAAAATTTTGACATAACCAAATCTCAGTGTACAACTTGTTGCGAGGGAAAGCAAGCCAGATTACCCTTTTCGCATGTTGGTGAGAGAAGCACAGAATTATTACAAAGAGTTCACACAGATATTTGTGGACCAATGGAGACCAGGTCATTGAATGGTGCTCGATATTTCATATTGTTTGTAGACGACTTCAGTCGGATGACTTTTATATACTTTATCAAGAATAAAAGTGAGACACTCAGCAAATTCAAAGAGTTTCAAACACTGGTTGAGAATCAGTTGAATAAGAAAATAAAGATGATACGTTCAGATAATGGACTTGAATTTTGTAATAAGGAGTTTGATAATTATTTGAAACAAAAGGGTATAATTCACCAAAGATCAAATAATTACACGCCTGAACAGAACGGTTTATGTGAGCGAGCCAACCGTACAGTTGTGGAGAAGGCTAGATGTCTATTGTATGATGCAATGGTGGATAAACGATTTTGGGCAGAAGCAGCGAACACTGCTGTCTATTTGAAGAATAGATCCGTTGCTTCAGGACTTCAGACAACTCCATACGAATTGTGGTATGGGAAGAAACCAGATCTCAGTCACATCCGTTTATTTGGAAGTAAAGTCATGGTGCATATTCCCAAGGAACGTCGTTTGAAATGGGATAAGAAAGCCATGGAGCACATTTTAGTGGGATATAGTGAAGAGGTCAAGGGTTACCGGCTGTATAATCT
TGCGAAGAAGAGTCTTGTTATAAGTAGAGATGTCATCGTGATGGAAAATGAATCAGAGCTCAAAGAAACAGGCAATGAGTCTATATGGATTCCTAACGAGGAGGCACTTGCAGATAACATAGAAAAAGAGAGCACAGTTTCAGTGGGGGATGAATTGCCTCATGTTTTAGAGACAGAAGACTGCCCTTCAGATTCTTCTTCAGTGTATGAGGACGGTAATGAAACACTAACGGAACCTTCATCACCAGCCACAGACATATTAGACACAGAAGTCCAGATGACATCAACCCCAGAACAGACACCAGTAGTACCTGAAAAGCGTTCCAGGAAAGCACCTGAACGGTATGGTTGGTATGGTACATGTCTTTCCAGCACTGTTTCAGCATCAGAAGAGATTGTGTTCTCAGAGGCTCTCGAAGGGCCTGAGAGGGAACAGTGGAAGCGTGCCATGGCAGATGAGCTGCAGTCTTTTGAGGACAGCGATGCATGGGAGCTTGTTGACAACCCCGGTGATGTGACGATTGTGAGGTGTAGATGGGTGTTTAACAAAAAGTTTGATGTGGACAATAATGTGAGGTTTCGGGCTAGATTGGTGGCAAAAGGTTTCTCGCAGAAACCTGGAATTGACTATACGGACACATTCTCTCCTGTTGTTAGACATTCAACACTCAGATTATTGTTTGCTCTCAGTGTTAAGATGAATATGAGTATTGATCATTTGGATGTTACTACAGCATTCCTAAACGGTTTCTTGAAAGAGACTATTTACATGTCCTTACCTGAAGGTTTTGTGAACAAAAGTGGGGGAAAGGTTTTAAAATTGAAAAGAGCAATCTATGGGTTAAAACAGTCTTCCCTTGCGTGGTATGACAGAGTTAAAGATTTGCTATGCAAATTAGATTTTAAAAACAGTCTGTATGAGCCTTGTTTGTTCACAAAAACTAAAGGTGAAGTTAAAATTATTGTGGCTTTATATGTGGACGATTTTTTAATTTTTTCGAACTGTCCTGTTGAAAGTAAAAAATTAAAAGATACTTTAGGATCAGAATTTAAGCTAAAAGATTTAGGGCCTGTAAGACAATATTTGGGCATGAGAATAAATGTTAGTAAAAATGTAATTACTGTAGATCAGCAGCAGTATATTGATCAGCTATTAACTAGGTTCAATATGTTGGATTGTAAAATGCATAAAACACCAATAGAATGTAAATTAAATTTGGAGAAACCTGATAAATGTGTGCCTGATGTTCCATATCAGAAGTTAATTGGATCATTAATGTATTTGGCAGTATTAACCAGGCCTGACATTTCCTATTCTGTTAGCTATTTGAGCCAATTTAACAGTTGCTATGATAATACACATTGGCATTATGCTAAACGCATTCTTAAATATTTGCAATGTACAAAAACATACTGTTTAAAATATTTTAAGGACGGTAGCAAGTTAGAGGGCTTTGTAGATAGTGATTGGCCAGCGATGCCATAGATAGGAAGTCCTACACGGGCTTTTGCTTTACTATGTCGGGATCTGTCATCTCATGGCAAAGCAGAAAGCAAACGTCTGTATCGCTCTCTAGTACAGAAGCAGAATATACTGCTTTGTCTGAGGCAGCTCGCGAAGGAGTCTATTTGAGAAACCTGTTACATGAAATAACTGGTAACCTATCTGTAATACAGATCTATTGTGACAATCAGAGTGCTTTAAAGTTGTCATTAAGTAACCAAAGTCACAACAGATCAAAACACATAGATGTGAGATTCCATTATGTAAGAGATGTTGTTAAAAATAAATATATTAAAGTTATGTACTTGTGTAGTGAGGAAATGCCTGCTGATTTACTTACCAAAGGATTGTGTACTATTAAGCATTATAAGTTTATGTGTAAATTGGGCATTGTACCTAAATAAAATTGTTGTTTATTTTGATAAGTGGGGGTGTAATAATGTGTATCAAAATATGTAA

Protein sequence:

>DPOGS200808-PA
MTANYIASVPKLKGRENYDEWSFAAENLLVLEGMDNYIKPTAGFEVKPAEDAKTKAKLILTVDPSLYVHIKNTKSAAELWTTLKTMFDDSGFSRKITLLRHLISIRLDNCDSMATYVTQMVETAQRLNGTGFTITDEWVGSLLLAGLSDRYSPMIMAIEHSGISITADVIKSKLLDMEVTRDNDAGAAFAARNNHSFNKSKKGGPGPSTSVSNKKITAAMTDSSKTITCYKCKEDGHYRNQCPLLKKNNSKCVFNVVFLNGKFNKTEWYVDSGASAHMTANESWVKNMNRSPCLPEIVVANESTVPIVCSGDVDIVSELKYEIIVKDVLCVPSLTTNLLSVSELIKNGNNVIFDEKHCYIRDKNDVLIATADLSDGVYKLRLETQHCMLAAPAVSGNLWHRRLAHLNSQDMKKMRNIVDGLSYEQNFDITKSQCTTCCEGKQARLPFSHVGERSTELLQRVHTDICGPMETRSLNGARYFILFVDDFSRMTFIYFIKNKSETLSKFKEFQTLVENQLNKKIKMIRSDNGLEFCNKEFDNYLKQKGIIHQRSNNYTPEQNGLCERANRTVVEKARCLLYDAMVDKRFWAEAANTAVYLKNRSVASGLQTTPYELWYGKKPDLSHIRLFGSKVMVHIPKERRLKWDKKAMEHILVGYSEEVKGYRLYNLAKKSLVISRDVIVMENESELKETGNESIWIPNEEALADNIEKESTVSVGDELPHVLETEDCPSDSSSVYEDGNETLTEPSSPATDILDTEVQMTSTPEQTPVVPEKRSRKAPERYGWYGTCLSSTVSASEEIVFSEALEGPEREQWKRAMADELQSFEDSDAWELVDNPGDVTIVRCRWVFNKKFDVDNNVRFRARLVAKGFSQKPGIDYTDTFSPVVRHSTLRLLFALSVKMNMSIDHLDVTTAFLNGFLKETIYMSLPEGFVNKSGGKVLKLKRAIYGLKQSSLAWYDRVKDLLCKLDFKNSLYEPCLFTKTKGEVKIIVALYVDDFLIFSNCPVESKKLKDTLGSEFKLKDLGPVRQYLGMRINVSKNVITVDQQQYIDQLLTRFNMLDCKMHKTPIECKLNLEKPDKCVPDVPYQKLIGSLMYLAVLTRPDISYSVSYLSQFNSCYDNTHWHYAKRILKYLQCTKTYCLKYFKDGSKLEGFVDSDWPAMP-