Monarch geneset OGS2.0

DPOGS200341
Transcript: DPOGS200341-TA, 3498 bp
Protein: DPOGS200341-PA, 1165 aa
Genomic position: DPSCF300026 + 500569-508533
RNAseq coverage: 692x (Rank: top 19%)
Annotation
Heliconius: HMEL000050; 0.0; 63.37%
Bombyx: BGIBMGA005634-TA; 0.0; 59.64%
Drosophila: stc-PB; 0.0; 45.94%
EBI UniRef50: UniRef50_D0AB98; 0.0; 61.64%; Putative shuttle craft n=42 Tax=Endopterygota RepID=D0AB98_9NEOP
NCBI RefSeq: XP_970597.2; 0.0; 44.85%; PREDICTED: similar to nuclear transcription factor, x-box binding 1 (nfx1) [Tribolium castaneum]
NCBI nr blastp: gi|261335960; 0.0; 61.64%; putative shuttle craft [Heliconius melpomene]
NCBI nr blastx: gi|261335960; 0.0; 61.90%; putative shuttle craft [Heliconius melpomene]
Group
Gene Ontology: GO:0003676; 2.2e-14; nucleic acid binding
KEGG pathway 
InterPro domain: [1030-1080] IPR001374; 2.2e-14; Single-stranded nucleic acid binding R3H
Orthology group: MCL11979; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200341-TA
ATGTCTCAGTGGAATAATTCTTACTCCTACAATAATCAGTACCACACACCTAATAATTGGAATGGGGACTATAACAACCAATATCAGGCATACTATCCAAACGCTCAATATAATGCGAACCAATATGTAAGCTTCGATGAATTTTTATCCCAAATGCACATTTCCAACCCTCAAACAAACCCATATAATACTCAATATCCAAACTATCCCAATAGTCAGTATTCACAGTTACCTAATTATCAAAATGATTCACCCAACCAAAATTCTCAAGCTGTGTATAATTATGAAACTAGTTCAAGCAATTATAATTACAACAATGAAACATACCAGGGGAATACAGAAGAACAGATTCAGCAAACATCAATAGATCAACAATTACCAAGAGAAGTTGTGAAATCCAAACTCATGCCCACTGCCACTGAGTTTGTCCCTAAGCAATCTAGCACTAGTAATAAAGAACAACATTCAAGCAATACAAATAGAAATGCAGGTGATAGTAACAATTCCAAGCCATCAGGTTCAACAAACTGGAGAGAAAGACCGCAGAACTCAAAGAATTCTTTTACTTCAGAATCTAGCAACTTTTACCAAAAAAATGTGAGACCTCAAGAATTAAATAACCGCCATAACAAATATGATTCAAAATACCGCAATCAAGATAATCAAAATACCAATGGTGAAAGTAGTGGCCAAAATTCTGCAAACCCTGTGAACAAAGATCGTCCAAGTGATGCTAATAATCGCAAAGGCAAATCTAAGAGCAGACCCTTTGAAAACAACCAAAATTCGGAACCTAGTTTCCGCAACCAATATAATCAAGGCTACAATAACCAGTCAAAACCTAAAAACCGTACTTACAATGGCTGTAATCATGAATCTAACCACAACGAAGATATAAACGATATCAGTTCAGCATCCGAATTACCTGAAAATTCTAACAGTGACGAAGGGGGCCAATCAAAAAGTAATTCTAAGTTTAAAAGCAAAGACTCTGACCCAAGTCGGACTTTTTATAACAGTGGAATGCCAAAAGAAAGCCAAGATGTAAGAAATGGTAGAAGTGAAGGGTCAGGAAGGAATCGTAGGTGGATAGGAAGTCAAAGGTTAAAAGGTGCGGAAAGAGATATTTATGATGATGAACAGTATGCAAAGTCTTATTTCCATGCCAAAGAAGAAAGAAATAGGGATAATCTATCAAGTCCGGCCAAAGGGAAGAGTAAAAACTTGTCTAACCCGGGAGCTAACATAGATATGACACAACGTGAGCGCTTAAGCGACCAACTAGACAGGGGCACCCTTGAGTGTCTTGTATGCTGCGAGAGAGTTAAACAAACTGATCCAGTATGGTATTGCGGTAACTGTTATCATGTATTGCATCTCCGCTGTATAAGGAAATGGGCTATAAGTAGCATGATTGAAACAAAATGGCGATGTCCAGCATGTCAAAATGTGAATCAAGACATACCTCATCCATGCACACTATTGTGCCACCCTGGACCATGCCCTCCGTGCCAGGCCACTATAAGCAAGCAATGCGGCTGTGGGGCGGAGACGCGTTCAGTGTTATGTAGTAGTAAATTACCGCAAGTCTGTGGAAGAGTGTGTAATAAAAAATTAGAATGCGGGGTTCATTCATGCACTAAACAGTGCCATGAAGAACAGTGCGACCCCTGCGAGGAAATTGTCACACAAGTGTGTCACTGCCCCGCGGCCAAGTCTCGCTCTGTGGCGTGCACGTCACACACGGACCGCACCAGCTGGTCCTGCGGCGAGCAGTGCGGGCGTGTGTTGTCATGCGGAGCGCACGTGTGTCGCGCGACCTGCCACGCCCCGCCCTGCCCTGCGTGTCCCCTCACACCCGACAACGTACCGGCCTGTCCGTGTGGCAAGACTAGGATCAATAAAGATCAGCGCAAGACTTGCGTGGACCCCATACCGCTCTGTGGCAACATTTGTGCTAAGCCTCTACCGTGCGGGCCGGCGGGCGACAAGCATATATG
TAAAGAGAGCTGTCACGAAGGTGGTTGCCGCGTCTGTCCCGACACAACTCTGCTGCAGTGCCGTTGCGGTCACTCGAGCCAAGAGGTGCCGTGTGCTGATCTGCCGCAGATGATCAACAACGTATTGTGTCAGAGAAAATGTAACAAGAAGTTGTCATGCGGCCGTCACCGCTGTCATACCCGGTGCTGTGACTCTGCCACTCATCGCTGCGCCGTCGTCTGCGGCCGTTCTCTCTCATGTCAGCTCCACAGATGTGAGGAGTTCTGTCACACTGGACACTGCGCGCCCTGTCCGAGAGTCAGTTTCGACGAGCTCCATTGTGAGTGTGGCGCGGAGGTGCTGATGCCGCCGGTCCGCTGCGGCACCAAACCACCCGCCTGCAGCGCCCCCTGCCGCAGGAGCAGACCCTGCGGTCACCCGCCGCACCACTCGTGCCACTCCGGCGATTGCCCGCCATGCGTCGTGCTTACAACTAAAATGTGTTACGGAAAGCATGAAGAACGGAAAACAATACCCTGTTCTCAAGAAGAATTCTCCTGCGGCCTGCCGTGCGGGAAACCTCTTCCTTGCGGTAAACACACTTGTATCAAAATATGTCACAAGGGATCTTGTGATATAAGCACATGTAGTCAACCCTGCACATCCAAACGGCCGAGTTGTGGTCACCCGTGTGCTGCGAGGTGTCACTCTAGCGGCGGGGGCTCCTGTCCCAGTCCGGCGCCCTGTCGCCGGCCAGTACGAGCCACCTGCCAGTGTGGACGAAAACAAACAGAGCGATCTTGCTGCGATAACGCCAGGGACTACGCCAAGATGATGAGTACCCTAGCGGCAACGAAAATGCAAGAGGGTGGCACTGTGGATATATCAGATGTTCAACGACCCGGATCAATGCTCAAAACATTGGAGTGTGACGAGGAGTGCTTCGTAGAGGCTCGGAGCCGTCGCCTGGCTTTGGCACTTCAGCTGCGAAATCCTGACGTATCGGCCAAGCTCGCGCCTCGATATAGCGATCATCTACGAACAACGGCCGCCAGGGAACCGACCTTCGCACAACAAATACACGACAAGCTGACGGAACTCGTCCAATTAGCCAAAAAGTCCAAACAGAAGACGAGAGCACATTCATTCCCATCTATGAACCGCCAAAAGCGTCAGTTCATCCACGAGCTGTGCGAACATTTCGGATGCGAAAGTGTTGCGTATGACGCTGAACCTAACAGAAACGTCGTTGCTACAGCCGACAAGGAAAAGTCTTGGCTGCCGGCTATGAGTGTACTAGAGGTGTTATCCCGGGAGGCTGGTAAGAGACGTGTACCCGGGCCGGTACTACGAGCGCCCGCCGCCGCCCTACCACCAACAAAGGAAATACCTTCCACCACCTCAAAGTCCTCATCGGGTGGTTGGGCAACGCTCACATCTACTAACGCGTGGGCGGCCCGCAGTCAGCCCAAGAAGGAAGAAACCAAAATTGACTATTTCGATAACCCTCCAGAGTAA

Protein sequence:

>DPOGS200341-PA
MSQWNNSYSYNNQYHTPNNWNGDYNNQYQAYYPNAQYNANQYVSFDEFLSQMHISNPQTNPYNTQYPNYPNSQYSQLPNYQNDSPNQNSQAVYNYETSSSNYNYNNETYQGNTEEQIQQTSIDQQLPREVVKSKLMPTATEFVPKQSSTSNKEQHSSNTNRNAGDSNNSKPSGSTNWRERPQNSKNSFTSESSNFYQKNVRPQELNNRHNKYDSKYRNQDNQNTNGESSGQNSANPVNKDRPSDANNRKGKSKSRPFENNQNSEPSFRNQYNQGYNNQSKPKNRTYNGCNHESNHNEDINDISSASELPENSNSDEGGQSKSNSKFKSKDSDPSRTFYNSGMPKESQDVRNGRSEGSGRNRRWIGSQRLKGAERDIYDDEQYAKSYFHAKEERNRDNLSSPAKGKSKNLSNPGANIDMTQRERLSDQLDRGTLECLVCCERVKQTDPVWYCGNCYHVLHLRCIRKWAISSMIETKWRCPACQNVNQDIPHPCTLLCHPGPCPPCQATISKQCGCGAETRSVLCSSKLPQVCGRVCNKKLECGVHSCTKQCHEEQCDPCEEIVTQVCHCPAAKSRSVACTSHTDRTSWSCGEQCGRVLSCGAHVCRATCHAPPCPACPLTPDNVPACPCGKTRINKDQRKTCVDPIPLCGNICAKPLPCGPAGDKHICKESCHEGGCRVCPDTTLLQCRCGHSSQEVPCADLPQMINNVLCQRKCNKKLSCGRHRCHTRCCDSATHRCAVVCGRSLSCQLHRCEEFCHTGHCAPCPRVSFDELHCECGAEVLMPPVRCGTKPPACSAPCRRSRPCGHPPHHSCHSGDCPPCVVLTTKMCYGKHEERKTIPCSQEEFSCGLPCGKPLPCGKHTCIKICHKGSCDISTCSQPCTSKRPSCGHPCAARCHSSGGGSCPSPAPCRRPVRATCQCGRKQTERSCCDNARDYAKMMSTLAATKMQEGGTVDISDVQRPGSMLKTLECDEECFVEARSRRLALALQLRNPDVSAKLAPRYSDHLRTTAAREPTFAQQIHDKLTELVQLAKKSKQKTRAHSFPSMNRQKRQFIHELCEHFGCESVAYDAEPNRNVVATADKEKSWLPAMSVLEVLSREAGKRRVPGPVLRAPAAALPPTKEIPSTTSKSSSGGWATLTSTNAWAARSQPKKEETKIDYFDNPPE-
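
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code and assuming that the trailing "-" in the protein sequence marks the stop codon; the function name `translate` and the constants are illustrative and not part of the database record. Under those assumptions, the listed lengths agree: 1165 codons plus one stop codon give 3 x 1166 = 3498 bp.

```python
from itertools import product

# Standard genetic code (assumed here), laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    Whitespace and line breaks are removed first, so a FASTA body that is
    wrapped over several lines can be passed in directly. Stop codons are
    written as `stop_symbol` ("-" matches the protein record above).
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Lengths taken from the record: 3498 bp transcript, 1165 aa protein.
transcript_bp = 3498
protein_aa = 1165
lengths_agree = transcript_bp == 3 * (protein_aa + 1)  # +1 for the stop codon

# First six and last six codons copied from the nucleotide sequence above.
first_residues = translate("ATGTCTCAGTGGAATAAT")
last_residues = translate("GATAACCCTCCAGAGTAA")
```

Passing the full DPOGS200341-TA sequence to `translate` should reproduce the DPOGS200341-PA sequence if both assumptions hold; the two short fragments correspond to the start (`MSQWNN`) and end (`DNPPE-`) of the protein.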