Monarch geneset OGS2.0

DPOGS200032
Transcript: DPOGS200032-TA; 1710 bp
Protein: DPOGS200032-PA; 569 aa
Genomic position: DPSCF300337 + 138677-148597
RNAseq coverage: 3159x (Rank: top 4%)
Annotation
Heliconius: HMEL003679; 2e-120; 89.58%
Bombyx: BGIBMGA012429-TA; 4e-124; 91.67%
Drosophila: Hrb27C-PB; 1e-115; 63.71%
EBI UniRef50: UniRef50_P48809; 2e-113; 63.71%; Heterogeneous nuclear ribonucleoprotein 27C n=27 Tax=Arthropoda RepID=RB27C_DROME
NCBI RefSeq: XP_966757.2; 2e-117; 60.37%; PREDICTED: similar to hrp48.1 [Tribolium castaneum]
NCBI nr blastp: gi|189241702; 3e-116; 60.37%; PREDICTED: similar to hrp48.1 [Tribolium castaneum]
NCBI nr blastx: gi|195438186; 1e-132; 57.97%; GK24781 [Drosophila willistoni]
Group
Gene Ontology: GO:0003676; 5.8e-23; nucleic acid binding
  GO:0000166; 4.7e-21; nucleotide binding
KEGG pathway: pop:POPTR_1089460; 4e-47
  K12741 (HNRNPA1_3) maps -> Spliceosome
InterPro domain: [14-86] IPR000504; 5.8e-23; RNA recognition motif domain
  [89-184] IPR012677; 4.7e-21; Nucleotide-binding, alpha-beta plait
Orthology group: MCL14119; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species
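
The lengths and coordinates in the record above can be sanity-checked with a little arithmetic. The Python sketch below is illustrative only: `Locus`, `parse_locus` and `cds_length_consistent` are hypothetical helper names, and it assumes that the genomic coordinates are 1-based and inclusive and that the transcript length counts the stop codon. Neither assumption is stated in the record itself.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A parsed 'scaffold strand start-end' genomic position (hypothetical helper)."""
    scaffold: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


def parse_locus(text: str) -> Locus:
    """Parse a position string such as 'DPSCF300337 + 138677-148597'."""
    match = re.fullmatch(r"\s*(\S+)\s+([+-])\s+(\d+)-(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised genomic position: {text!r}")
    scaffold, strand, start, end = match.groups()
    return Locus(scaffold, strand, int(start), int(end))


def cds_length_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly protein_aa codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


locus = parse_locus("DPSCF300337 + 138677-148597")
print(locus.scaffold, locus.strand, locus.span)
print(cds_length_consistent(1710, 569))
```

Under these assumptions the locus spans 9921 bp on the plus strand, and 1710 bp equals 3 x (569 + 1), so the listed transcript and protein lengths agree.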

Nucleotide sequence:

>DPOGS200032-TA
ATGCGTATGAATCCAGACATGGACGATGATGAGAAGGGAAAACTGTTTGTTGGCGGTCTATCATGGGAGACATCGCAGGAGAATCTGCAGCGTTACTTCTCCCGCTACGGCGACGTGATTGATTGTGTTGTTATGAAGAACAGCGAGTCTGGCCGTTCAAGAGGTTTCGGTTTTGTTACCTTTGCTGAACCCTCACTGGTCAATGTCGTGCTTCAGAATGGTCCCCATCAACTCGATGGCAGGACAATCGACCCGAAACCGTGCAATCCAAGGACTCTTCAGAAGCCCAAGCGCGGCGGCGGCTATCCGAAGGTGTTCCTCGGAGGTCTGCCATCCAACATCACCGAGACTGACCTGCGCGTGTTCTTCGGACGGTACGGCAAGGTCATGGAGGTCGTCATCATGTATGACCAGGAGAAAAAGAAGTCTAGAGGCTTCGGATTTCTGTCATTTGAAGATGAAATCTCTGTTGAGAGAGTCACCCAGGAGCATTTCATCAACCTGAACGGCAAACAGGTCGAGATCAAGCGCGCGGAGCCTCGCGATGGTTCCGGCAAATTGGGCTCCGGAGGAGGCATGGGCGGGGGCATGGGGGGAGCGCCCGGAGATGCGCCCGCCGCGGGACAGTGGGGACCACCGCAAGCCGCGCCCATGAATATCATACAGGGCCACAACGGACAGATGGGCGGCCCACCGATCAACATGCCCATGGCCGGACCTAACATAATGCAGGGATACCAGGGCTGGGGAACGTCAGCCGGGCAGACTTCGTACGCCGGGTACGGTGCGGCGGGCGGCGGCGCGGGCCCTGGCAACTACCAGGGCTGGGGAGCTCCGCCGGCGCCTCAGGCCCCGCCGCACGCGCCCGCCTGGCCCGCCACCAACAACTACACTCAACACGCTCAGCCGCCCGCCCAGGGATACGGCAGCTACGCGAACTACAGCTCGGCGCCGGCGGGCGCCTCGGCCGGTGGCAGCTGGACTAATTGGAGCATGCCGCAGAACTCCAACTCTACCGGCTCCGGATCGTACGTTCCGTTGTCGGAGGGCGGTGAGATGTACGGTCGCGGTGGTGGCGGTGGCACGGGCGCGGCGCCGGCTCTGGCCGGGGCGCTGTCCTCCGCCGCCCTCTCCAAGTCCTCCTCCGCCGACTACTCTACCTACCAGCAGTACCCGCCCGCCTACCAGCAGGATCAGGTCTCTCATCACCCTTCCCCTCCCTCCCCCGGCCACCACACGTGTCCCCTCCCCCCGGTAGCCCTTGCCGCCTCCACCACCCCTCACTTCATCCCTCTGGGACGACTCACGCTTTGCCCGTCCATCTTCACGCTCTGCGCTACTTACACATCTGCTAGACGCGAGGGATTGTGGCGAGTGGGACTTTATGTCTGGTTGGCAATCCGGTCGTTACTAGACGGGGTAGGACACAGGGCAGTAGGGCGGCCGGCTAACGTGTCGCCACCGGGCAGGGCTCCTCGTACGGCGGCGGAGGGTCGCGCTACGCTGCCGGCGAGTACCACTCGGCCGCCGCTCAGCCGCCAGGTGTGCACCCTCACCCGCACCACGCGCCCAAACACTTCAACAACGAGTTTAACAAGGAGTGGCCAGCCGCCTAGTGCCCCCCGCCACCCCCGCGACCCCACGCACACGCACCAGCCGCCCGTCCACCGCCTCGCCCGCCGCGACCCCGCCATCTCTTACCTCAGTTAG

Protein sequence:

>DPOGS200032-PA
MRMNPDMDDDEKGKLFVGGLSWETSQENLQRYFSRYGDVIDCVVMKNSESGRSRGFGFVTFAEPSLVNVVLQNGPHQLDGRTIDPKPCNPRTLQKPKRGGGYPKVFLGGLPSNITETDLRVFFGRYGKVMEVVIMYDQEKKKSRGFGFLSFEDEISVERVTQEHFINLNGKQVEIKRAEPRDGSGKLGSGGGMGGGMGGAPGDAPAAGQWGPPQAAPMNIIQGHNGQMGGPPINMPMAGPNIMQGYQGWGTSAGQTSYAGYGAAGGGAGPGNYQGWGAPPAPQAPPHAPAWPATNNYTQHAQPPAQGYGSYANYSSAPAGASAGGSWTNWSMPQNSNSTGSGSYVPLSEGGEMYGRGGGGGTGAAPALAGALSSAALSKSSSADYSTYQQYPPAYQQDQVSHHPSPPSPGHHTCPLPPVALAASTTPHFIPLGRLTLCPSIFTLCATYTSARREGLWRVGLYVWLAIRSLLDGVGHRAVGRPANVSPPGRAPRTAAEGRATLPASTTRPPLSRQVCTLTRTTRPNTSTTSLTRSGQPPSAPRHPRDPTHTHQPPVHRLARRDPAISYLS-
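
The protein record above should correspond to the conceptual translation of the transcript record. A minimal Python sketch of that check follows. It assumes the standard genetic code (NCBI translation table 1) and that the trailing `-` in the protein sequence marks the stop codon. `translate` is an illustrative helper, not a function provided by the database.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(cds: str, stop_symbol: str = "*") -> str:
    """Translate a coding sequence in frame 0, codon by codon.

    Stop codons are rendered as stop_symbol; translation does not halt at them,
    so an unexpected internal stop stays visible in the output.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        residue = CODON_TABLE[cds[i:i + 3]]  # KeyError on non-ACGT bases
        residues.append(stop_symbol if residue == "*" else residue)
    return "".join(residues)


# First 15 codons and last 4 codons of DPOGS200032-TA, copied from the record.
print(translate("ATGCGTATGAATCCAGACATGGACGATGATGAGAAGGGAAAACTG"))
print(translate("TACCTCAGTTAG", stop_symbol="-"))
```

The two fragments translate to `MRMNPDMDDDEKGKL` and `YLS-`, which match the start and end of DPOGS200032-PA. Passing the full nucleotide sequence with `stop_symbol="-"` and comparing the result with the full protein string extends the check to the whole record.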