Monarch geneset OGS2.0

DPOGS201143
Transcript: DPOGS201143-TA (3291 bp)
Protein: DPOGS201143-PA (1096 aa)
Genomic position: DPSCF300065 - 154189-170441
RNAseq coverage: 102x (Rank: top 61%)
Annotation
Heliconius      HMEL010446        3e-161  65.22%
Bombyx          BGIBMGA003947-TA  0.0     76.06%
Drosophila      CG10064-PA        4e-141  38.88%
EBI UniRef50    UniRef50_Q7Q6D0   1e-175  48.19%  AGAP005915-PA n=3 Tax=Culicidae RepID=Q7Q6D0_ANOGA
NCBI RefSeq     XP_001653637.1    2e-178  46.97%  hypothetical protein AaeL_AAEL009035 [Aedes aegypti]
NCBI nr blastp  gi|157120505      5e-177  46.97%  hypothetical protein AaeL_AAEL009035 [Aedes aegypti]
NCBI nr blastx  gi|158294992      3e-172  48.04%  AGAP005915-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology   GO:0005515  1.1e-60  protein binding
                GO:0003964  2.3e-07  RNA-directed DNA polymerase activity
                GO:0003723  2.3e-07  RNA binding
                GO:0006278  2.3e-07  RNA-dependent DNA replication
KEGG pathway 
InterPro domain  [320-658]   IPR011046  1.1e-60  WD40 repeat-like-containing domain
                 [340-654]   IPR015943  5.4e-60  WD40/YVTN repeat-like-containing domain
                 [917-1041]  IPR000477  2.3e-07  Reverse transcriptase
                 [488-527]   IPR001680  9.1e-07  WD40 repeat
                 [621-653]   IPR019781  1.1e-06  WD40 repeat, subgroup
Orthology group: MCL13439 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201143-TA
ATGGACGTGAAAAACTTAGAACCTTGTAGTATAATAGGTTTTGATGGCTCAGCAATAAAAGGTTTAAATGTTCATCCGGATGGCGATCACATCGTCTATCCAATTGGCAACAAGGTGTGCATTCAGGAATGGAAGAGTAAGAAGATGTATTTCCTCAGTGGACACACTAACAGTGTTTCGACTTTGGCCATTTCACCCAAAGGCACATACGTAGGGTCAGGACAAATAAACCACATCGGCTTCAAAGCCTCGACGAAATTGTGGGATTTTAAAAAGAAAGTTTTGATTGGGACTCATGAGCTCCATAAAGTGCGTGTTGAGGCTCTTTCGTTTTCGTCTGATGAGCGTTACATGATATCCCTGGGCGGGAGGGATGACGGCTGTATCGTCCTATGGGACTGTGTGGCGGGCGCCGCCACCGGCACCGCGGCCGCCGCCAAGCTTACCACCGGAGACGCCATGACACTAGTGCCGCTCAGTTTGAGAGTCAATTCATTTGTTACGGGAGGAGATTCGAATCTCCGTGTCTGGAACATTCGGGCCGGCAGCAAGAACTTGGACGTGGTGGACGTGTCCCTGGGCAAGCTGAGGCGATGCGTGCGATGCATCACCGTCAGCAACGACGACACTTACGGCATCGCGGGGACCACTTCCGGTGACATGATAAAGTTCTCCATCAACTATCCGTCTGACCCAAAAGTCTCCGTGGCTGATTGCAAACCTTCTCTTATCGGCTGTCTAGCGAAGTGTGGACCTTGGAAGAAAGGAAAACGGCCGGAATCTTTATGCTATAGTCAAGGTATAGAATCAATCCTCATTATGACTGACGGGTGTCTGATAGTCGGTGCCGGGGACGGTACCATCGACATCCTCGATGAGATAAATTGGAAAGGAGAGTTTCCCAAAGGAAAAATACTTGATCCAAACAAACCTTTCTTTAAAGCTCTAAGAAGCACAAAACTTGAAGGTAGCATCACGTCTCTGGAGCTGATGGAGTCTCCGTCGGTCCGCGGCCCGAGCCGGCTCCTGCTGGCCGGGACCCGCACCAGTGAGATATACGCCATCAACGTGGACACCTTCGCACCCACGCTGGTCGTCACGTGCCATCGATGTGCCATCAACGATATAGCGTTTCCCAGAGGTATGTCGGGCGTGTTCGCCACAGCGGGTGCGGGTGACGTTCGCGTGTGGTGTACCGAGTCCGGTATCGAACTGCTCCGTATCGTGGTGCCTAACTTCGTGTGTTCCTCCCTGCTGTTCACAGACGACGGGAAAGCTATCGTTACCGCGTGGAACGATGGCAACATACGAGCGTTCACTCCGCTCAGCGGCCGTCTGATCTACTGCATCTACAACACACACAACAAGGGGACATCCGCCCTGGACATGACGCACGACGGACGGACTCTCATATCTGGAGGCTGCGAGGGCCAGGTCCGCGTTTGGGACATCAGACCCGAATGCCAGAGTCTCAAGAAAGTGCTAAAGGAACACAAGAGTCCGGTGTCCGCCATACAAGTGTCGCCCAACGACACGGAGGCCGTCAGCGCGGGGACCGACGGATCTTGCATTATATGGGACTTGATTTCGTTGTCTCGTCGCCAGGTCATGTACGCGAACACTCTGTTCATGTGCGTGTGCTTCGAGCCGCGAGGAGTGCAGCTGCTGACGGGAGGCACGGACCGACGAGTGGCCTACTGGGAGGCGGGCAGCGGCAACCTGGCCAGGGAGCTGGAAGCGAGCAAAGTGGGAGCTATAAACGGAATACATATAACACAGAAAGGAGACCTATTCGTGACCGGCGGCAACGACCAAATGGTCAAACTGTGGAAGTATCAGGAGGGTATTTACACTCACATGGGTCTGGGTCACGCGGGCGCGGTCACTTCCTGCCGGTTCAGCCCCGACGCCAAGGTCATCGTCAGCAGCTGTGCGGCCGGGACTATCATAGTATGGAAGGTGCCCGAGATGTACGTCAACGACGATAAGCAATCGTCTGG
ACGAAAAAGTTCAGGGGAACCGAAAACATTACAGGAGGAGTTACAGCTTCCCTTAGATGATAAACCCGTCCACAAGGCGGCTGGCGACCGACCTAACAAGTTACCGGCCGCGCCTTCCAAGGTCGAGAAGATCGCTTCTATAGATACGAGTCGAAGTAGCGTCCATTCGTGTTGTCCGTGCGACTCGGCCAAATCTAATCAGTCCAAAAAAACTACACCCAAGAGTGGCCCCAAGTGTTGTCGACCGCCGGGTGACTGTCGATTCATGGAGAATGAGCTGGTTTTCTCAGAGAGCTTTCTAAGCGGATCGACCGAGTATTCTTTACGTGCGGATAAAAAGATGCCATCATACGACCACGTTCGGTCAGAACTATATGCGGAGATAGTGGAACACATTCTTGCTCAAAATAAAAACATCTTGACAAACGTTGAGCCCAAAGACAGGATCATTCGAGTGCGATACAGGAAGCGAGCTAGGAACCAACATGAATGCCACCCAATACTGAAAGTATCACCGATATTCCACAAAAACATACTGGAGGCTGGCAAAATTTACATCGAGCTACAAAGAATACCTACGAGCTACCTAGTCCAATTCACGAAATGCCTTGGTTTCGGTCATACGAGAACAGTATGTCAAGAGGAGCAAGAAAGGTGCAGCTACTGCAGGGCAGCCCACACATGGGAGAAATGCCCAGCAACAAAACATTTTTTCCTAACATCTCAGGTCGAAGCGATAGATTTTATTCAGGATCACGACTGGCTGGTCAAAATCGAAATTCACCAGGCCTACTTTCACCTGCTGGTCGCGGAGACACACAGGCGATTTCTTCGAGTGGTTTACAAGGAGGAGATCTTTCAATTAACAGCACTGCTGTTGGGCGTTTCCTCAGTGCCTCGGACTTTTGGAACAGTCACAAACTGGGTGGCCGAAATCCTCAGAAATCAGGGCATATGTCTAGTAGTGTACCTAGACGACTTCCTTTTAGCCAATCAGAACAGAAACAAGCTTATAGCACAGGTTGCGGAAACTCTGGCTATCTTAAAATCTCTGGGACGGTACTTAAATGTAAAAAAATCAATGACCGAACTAACTCACAAACTGGAATACCTAGGTCTGGTTTGGGACATTCAGAGTCAAATAATAGCACTGCCGACCCGAAAAGTTCTAAGCATAAAGAATTCCTTCAGTGGCTTGTTAACCCGAGAAAAAAAGTTCATTAAGGGAGCTTCAGAGTCAACTAGGCAAATTAAACTTTACCAACCATGCCATCGAGGTCGGCTTCACTGA

Protein sequence:

>DPOGS201143-PA
MDVKNLEPCSIIGFDGSAIKGLNVHPDGDHIVYPIGNKVCIQEWKSKKMYFLSGHTNSVSTLAISPKGTYVGSGQINHIGFKASTKLWDFKKKVLIGTHELHKVRVEALSFSSDERYMISLGGRDDGCIVLWDCVAGAATGTAAAAKLTTGDAMTLVPLSLRVNSFVTGGDSNLRVWNIRAGSKNLDVVDVSLGKLRRCVRCITVSNDDTYGIAGTTSGDMIKFSINYPSDPKVSVADCKPSLIGCLAKCGPWKKGKRPESLCYSQGIESILIMTDGCLIVGAGDGTIDILDEINWKGEFPKGKILDPNKPFFKALRSTKLEGSITSLELMESPSVRGPSRLLLAGTRTSEIYAINVDTFAPTLVVTCHRCAINDIAFPRGMSGVFATAGAGDVRVWCTESGIELLRIVVPNFVCSSLLFTDDGKAIVTAWNDGNIRAFTPLSGRLIYCIYNTHNKGTSALDMTHDGRTLISGGCEGQVRVWDIRPECQSLKKVLKEHKSPVSAIQVSPNDTEAVSAGTDGSCIIWDLISLSRRQVMYANTLFMCVCFEPRGVQLLTGGTDRRVAYWEAGSGNLARELEASKVGAINGIHITQKGDLFVTGGNDQMVKLWKYQEGIYTHMGLGHAGAVTSCRFSPDAKVIVSSCAAGTIIVWKVPEMYVNDDKQSSGRKSSGEPKTLQEELQLPLDDKPVHKAAGDRPNKLPAAPSKVEKIASIDTSRSSVHSCCPCDSAKSNQSKKTTPKSGPKCCRPPGDCRFMENELVFSESFLSGSTEYSLRADKKMPSYDHVRSELYAEIVEHILAQNKNILTNVEPKDRIIRVRYRKRARNQHECHPILKVSPIFHKNILEAGKIYIELQRIPTSYLVQFTKCLGFGHTRTVCQEEQERCSYCRAAHTWEKCPATKHFFLTSQVEAIDFIQDHDWLVKIEIHQAYFHLLVAETHRRFLRVVYKEEIFQLTALLLGVSSVPRTFGTVTNWVAEILRNQGICLVVYLDDFLLANQNRNKLIAQVAETLAILKSLGRYLNVKKSMTELTHKLEYLGLVWDIQSQIIALPTRKVLSIKNSFSGLLTREKKFIKGASESTRQIKLYQPCHRGRLH-
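
The stated lengths in this record (3291 bp transcript, 1096 aa protein) can be cross-checked with a short script. The sketch below is illustrative only and not part of the database. The helper names parse_fasta and cds_matches_protein are hypothetical. It assumes the transcript is coding sequence only (no UTRs) and ends in a single stop codon, which appears to be what the trailing "-" in the protein sequence denotes.

```python
def parse_fasta(text):
    """Return {record_id: sequence} from FASTA text, joining wrapped lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Use the first whitespace-delimited token after ">" as the ID.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def cds_matches_protein(transcript_len, protein_len):
    """True if the transcript is exactly protein_len codons plus one stop codon.

    Assumption: the transcript contains only coding sequence (no UTRs).
    """
    return transcript_len == 3 * (protein_len + 1)


# Lengths as stated in the record: 3291 bp and 1096 aa.
print(cds_matches_protein(3291, 1096))  # True
```

With the sequences above saved to a file, parse_fasta(open(path).read()) would return the two records keyed by DPOGS201143-TA and DPOGS201143-PA, so the same check could be run on the actual sequence lengths (subtracting the trailing "-" from the protein length).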