Monarch geneset OGS2.0

DPOGS214415
Transcript        DPOGS214415-TA    3309 bp
Protein           DPOGS214415-PA    1102 aa
Genomic position  DPSCF300069 + 183268-190982
RNAseq coverage   59x (Rank: top 68%)
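
The lengths listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original record. It assumes that the transcript is a pure coding sequence that includes the stop codon, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical and chosen for illustration only.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3309
PROTEIN_AA = 1102
GENOMIC_START, GENOMIC_END = 183268, 190982


def expected_cds_length(protein_aa: int, with_stop: bool = True) -> int:
    """Length in bp of a coding sequence for a protein of the given length.

    Assumption: one codon (3 bp) per residue, plus one stop codon if requested.
    """
    return 3 * (protein_aa + (1 if with_stop else 0))


def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# True if the transcript length matches the protein length plus a stop codon.
cds_ok = expected_cds_length(PROTEIN_AA) == TRANSCRIPT_BP
genomic_span = span_length(GENOMIC_START, GENOMIC_END)

print("CDS length consistent with protein length:", cds_ok)
print("Genomic span (bp):", genomic_span)
```

Under these assumptions the transcript and protein lengths agree (3 x (1102 + 1) = 3309). The genomic span is longer than the transcript, which would be expected if the gene model contains introns.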
Annotation
Heliconius       HMEL010614        2e-157  71.69%
Bombyx           BGIBMGA011248-TA  0.0     52.50%
Drosophila       crol-PE           5e-28   25.14%
EBI UniRef50     UniRef50_C3XX02   8e-67   31.91%  Putative uncharacterized protein n=14 Tax=Chordata RepID=C3XX02_BRAFL
NCBI RefSeq      XP_001945749.1    2e-55   28.76%  PREDICTED: similar to mCG7830 [Acyrthosiphon pisum]
NCBI nr blastp   gi|260832612      6e-67   30.31%  hypothetical protein BRAFLDRAFT_261844 [Branchiostoma floridae]
NCBI nr blastx   gi|260811193      3e-78   29.70%  hypothetical protein BRAFLDRAFT_66809 [Branchiostoma floridae]
Group
Gene Ontology    GO:0003676  4e-08  nucleic acid binding
KEGG pathway
InterPro domain  [537-565]  IPR013087  4e-08  Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group  MCL25825  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214415-TA
ATGGATTTTGTTAAGAGAGAGGATAATCATGTGTTGGAGGAAGAAAATTATAAATTATACATGCAAAACCTACAAAAATCACATCAAATCGATACAACGGAATATCAGCATACTGCAATATCAATGGAACAACTCCACCAAGCGGTGCAGTCCACACTCAACAATGGATCATTACACAACTTCCAATTACCGATATCCCCTCGACAAATGTCCACACTCATGCTGAAGACCAACCACCTCACACGATCACTCACTTATAATGATCTCCCGCAACATCTCTCAAAGAATCTGGCCATGGACATGGAATTGTTACCGATGACGAATGAAATTCAGCAAATCCGCCACGATGATATACTAAGCCAGAATCTGTCGAGAAATCTCGATATAACGCTGGCACGCAATCTTAACAATGAACTCGAATTACAGAACAGCTTGGCTCAAATACAAAGTTTACATGAACATGAAATGTCAAGGAATGCGGAGATCACTCAGAATTTAAGTCGAGTCAACGAATTAACACAGAACATGGGTCGGGAACTGTCACATGAGCTGATTAACGAAATCGACTTATCACAGCTGAGTAGACACGGATTAGATCAGAATATTTTAAGTCACGATGAAGGGAGACGAAGCCCAATAATTCATTCTGTTGATAATCATTTATTGGAGCACCACATTGCTCAAAGATTGGAACAGAATATAGCCATGCGATTGGATTCAGACAGATTGGATCAAACGCTATCCCAGAGGTTGGATCAAACATTAGCGCAAAGGTTGGATCAGACATTGGCCCAACGATTGGATCAAAGACTATTGAATCCTGGTGTCATACATGATCACAGGGTATTGGAGCAAAATGAGAACTTACTACCCATGCCTTTCCACATAAAATCCGAACAAGATGACGATAGCTATTTCTATGACAACCTGAATCAAGGAATCAATACAGCTTCCGTCAACAATGAGTTACAACAAAACGATCACAGTCAGACGAATTCAATACAACATGAACAAATGTATTCGTTATTCAATAACACATCAATACCGGCTCTGAACACCATAGACTTATATTCCCGGACCCCATACATACAGAACTATCCAGAAATTGCGCGGGAAAATCCACAGAATCTAGTCGTTCATAGACAATACGACAATAACAGTCCATACGCTGACGAAACAAAGAAAAAGCAAGAGGATGTCAAAAAGAATACCAAAATCGAACATCCCAAAGAGAATTTGAAACCTAACGATCAAAATAAACTATACTACGATTACAACGATTACGTGAACGTCGAAAAAAATGAGAACGATTCTAGTTCGCATAACAAAATAACAGAGGAACTCGCGTTGAATATAAAAGGTGAGTACGCGTGTTACAAGTGCAACGAAGTGTTCCCATCGAAGAGGTTATTGAAACAACATTCTAAGAATTGTGAAAGTGCTGATAGTGATTTAGATAAATTGGGTAAATTCAGTTGTTCACAGTGCGCGTATAGATGTCAATCTCCCGCCATTTTGAAAATACACGAAAGAACACATACAGGCGAGAAACCGTACGCGTGCACGTTCTGCGATTATAAATCAGGTCAAAAGAATAACGTGGCCAAGCACATACTAGTGCACATGAAACAGAAGCCTTTCAGCTGTCAGTATTGTGATTATAAATGCGCTCAAAAGAATAATTTAGTTGTCCACGAGAGGACTCACACGGGTTACAAGCCATTCGCATGCCCCTACTGCGATTACAGGACGGTTCAGAAGCCTAATTTAGTCAAACATATGTATTTGCACACCGACCAGAAGCCATTCAGCTGTGATATGTGTAATTATAGGTGCGTTCAAAAGACGAACCTTACGAAACACAAGCAACGTCATCTGACCGAATGCGACAAAATGGATATCAAAAATCAAGTGAAGCCCTACAAGCCTAGACAGAAATCGGTCAAATGCGCCCATTGTTCGTACAGGGTATTGGAGCAAAATGAGAACTTACTACCCAT
GCCTTTCCACATAAAATCCGAACAAGATGACGATAGCTATTTCTATGACAACCTGAATCAAGGAATCAATACAGCTTCCGTCAACAATGAGTTACAACAAAACGATCACAGTCAGACGAATTCAATACAACATGAACAAATGTATTCGTTATTCAATAACACATCAATACCGGCTCTGAACACCATAGACTTATATTCCCGGACCCCATACATACAGAACTATCCAGAAATTGCGCGGGAAAATCCACAGAATCTAGTCGTTCATAGACAATACGACAATAACAGTCCATACGCTGACGAAACAAAGAAAAAGCAAGAGGATGTCAAAAAGAATACCAAAATCGAACATCCCAAAGAGAATTTGAAACCTAACGATCAAAATAAACTATACTACGATTACAACGATTACGTGAACGTCGAAAAAAATGAGAACGATTCTAGTTCGCATAACAAAATAACAGAGGAACTCGCGTTGAATATAAAAGGTGAGTACGCGTGTTACAAGTGCAACGAAGTGTTCCCATCGAAGAGGTTATTAAAACAACATTCTAAGAACTGTGAAAGTGCTGATAGTGATTTAGATAAATTGGGTAAATTCAGTTGTTCACAGTGCGCGTATAGATGTCAATCTCCCGCCATTTTGAAAATACACGAAAGAACACATACAGGCGAGAAACCGTACGCGTGCACGTTCTGCGATTATAAATCAGGTCAAAAGAATAACGTGGCCAAGCACATACTAGTGCACATGAAACAGAAGCCTTTCAGCTGTCAGTATTGTGATTATAAATGCGCTCAAAAGAATAATTTAGTTGTCCACGAGAGGACTCACACGGGTTACAAGCCATTCGCATGCCCCTACTGCGATTACAGGACGGTTCAGAAGCCTAATTTAGTCAAACATATGTATTTGCACACCGACCAGAAGCCATTCAGCTGTGATATGTGTAATTATAGGTGCGTTCAAAAGACGAACCTTACGAAACACAAGCAACGTCATCTGACCGAATGCGACAAAATGGATATCAAAAATCAAGTGAAGCCCTACAAGCCTAGACAGAAATCGGTCAAATGCGCCCATTGTTCGTACAGGTGTGTACAGAAATCTAGTTTAGATAAACATATGCAATTCAAACATAGTGACATACAAACGGATATGCAATTCAAACAAAGTGATTTACGTACTGATTTGCAATTCAAACAAAGCGAATTGCAAAGTGACTTGAGTGATGGTGTTAATGGAACTAGTGACTTTGACAGTATACAGAATTTGAGTATAAAAGACATGTCTCAGGAGATCTGTACTTGA

Protein sequence:

>DPOGS214415-PA
MDFVKREDNHVLEEENYKLYMQNLQKSHQIDTTEYQHTAISMEQLHQAVQSTLNNGSLHNFQLPISPRQMSTLMLKTNHLTRSLTYNDLPQHLSKNLAMDMELLPMTNEIQQIRHDDILSQNLSRNLDITLARNLNNELELQNSLAQIQSLHEHEMSRNAEITQNLSRVNELTQNMGRELSHELINEIDLSQLSRHGLDQNILSHDEGRRSPIIHSVDNHLLEHHIAQRLEQNIAMRLDSDRLDQTLSQRLDQTLAQRLDQTLAQRLDQRLLNPGVIHDHRVLEQNENLLPMPFHIKSEQDDDSYFYDNLNQGINTASVNNELQQNDHSQTNSIQHEQMYSLFNNTSIPALNTIDLYSRTPYIQNYPEIARENPQNLVVHRQYDNNSPYADETKKKQEDVKKNTKIEHPKENLKPNDQNKLYYDYNDYVNVEKNENDSSSHNKITEELALNIKGEYACYKCNEVFPSKRLLKQHSKNCESADSDLDKLGKFSCSQCAYRCQSPAILKIHERTHTGEKPYACTFCDYKSGQKNNVAKHILVHMKQKPFSCQYCDYKCAQKNNLVVHERTHTGYKPFACPYCDYRTVQKPNLVKHMYLHTDQKPFSCDMCNYRCVQKTNLTKHKQRHLTECDKMDIKNQVKPYKPRQKSVKCAHCSYRVLEQNENLLPMPFHIKSEQDDDSYFYDNLNQGINTASVNNELQQNDHSQTNSIQHEQMYSLFNNTSIPALNTIDLYSRTPYIQNYPEIARENPQNLVVHRQYDNNSPYADETKKKQEDVKKNTKIEHPKENLKPNDQNKLYYDYNDYVNVEKNENDSSSHNKITEELALNIKGEYACYKCNEVFPSKRLLKQHSKNCESADSDLDKLGKFSCSQCAYRCQSPAILKIHERTHTGEKPYACTFCDYKSGQKNNVAKHILVHMKQKPFSCQYCDYKCAQKNNLVVHERTHTGYKPFACPYCDYRTVQKPNLVKHMYLHTDQKPFSCDMCNYRCVQKTNLTKHKQRHLTECDKMDIKNQVKPYKPRQKSVKCAHCSYRCVQKSSLDKHMQFKHSDIQTDMQFKQSDLRTDLQFKQSELQSDLSDGVNGTSDFDSIQNLSIKDMSQEICT-