Monarch geneset OGS2.0

DPOGS210903
Transcript: DPOGS210903-TA (3285 bp)
Protein: DPOGS210903-PA (1094 aa)
Genomic position: DPSCF300045 - 235753-245135
RNAseq coverage: 152x (Rank: top 53%)
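The lengths listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original record. It assumes the transcript is pure coding sequence (no UTRs) and that the genomic coordinates are 1-based and inclusive; both are assumptions, and the variable names are illustrative only.

```python
# Values copied from the record above.
transcript_bp = 3285
protein_aa = 1094

# Assumption: the transcript contains only coding sequence, so its codon
# count should equal the number of residues plus one stop codon.
codons, remainder = divmod(transcript_bp, 3)
has_stop_codon = remainder == 0 and codons == protein_aa + 1

# Assumption: genomic coordinates are 1-based and inclusive.
start, end = 235753, 245135
genomic_span_bp = end - start + 1

print(codons, remainder, has_stop_codon, genomic_span_bp)
# prints: 1095 0 True 9383
```

Under these assumptions the transcript accounts for 1094 residues plus a stop codon, and the gene model spans 9383 bp of the scaffold, which is longer than the transcript and would be consistent with the presence of introns.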
Annotation
Heliconius      HMEL015824              8e-131  55.53%
Bombyx          BGIBMGA003087-TA        4e-131  51.31%
Drosophila      CG6654-PA               4e-40   34.33%
EBI UniRef50    UniRef50_UPI0002064A9A  2e-88   44.91%  UPI0002064A9A related cluster n=4 Tax=unknown RepID=UPI0002064A9A
NCBI RefSeq     XP_395427.2             1e-88   44.91%  PREDICTED: similar to zinc finger protein 585A, partial [Apis mellifera]
NCBI nr blastp  gi|383858142            2e-90   43.86%  PREDICTED: uncharacterized protein LOC100874963 [Megachile rotundata]
NCBI nr blastx  gi|383858142            4e-97   30.67%  PREDICTED: uncharacterized protein LOC100874963 [Megachile rotundata]
Group
Gene Ontology:
  GO:0003676  6.1e-15  nucleic acid binding
  GO:0005634  6.9e-12  nucleus
  GO:0008270  6.9e-12  zinc ion binding
  GO:0005622  2.7e-05  intracellular
KEGG pathway 
InterPro domain:
  [767-795]  IPR013087  6.1e-15  Zinc finger, C2H2-type/integrase, DNA-binding
  [10-83]    IPR012934  6.9e-12  Zinc finger, AD-type
Orthology group: MCL17984 (Insect specific)

Nucleotide sequence:

>DPOGS210903-TA
ATGGAAGTGTTTCTTTATAACTCTACAGTTTGTAGATTATGTGGCGAAGAAAATGATAATGGAACATTACTATATTCATGTGAAGAAAATAATCAAAGCTTATGTGAAATAATTAATACCTATTTGCCAATAAAGGTATCTGATGATGGAGAACTACCACGGACTATTTGCCCTGGATGTACAATTCAATTGGAAGCAACAGTTGAATTTTTAAATCTAATTATAAATGGTCAAAAAATTTTGCGTGAACTTTACCAACGAGAGAAGGAATACAAAAAGACTGTTCTTAATAATTCCAATAAAGGAACTCCGGAAGTTATATCAGAAAAAATCATTTACGAAATAAATACAAGCAATGGGGTGTATCAAGTTGAGCATCCAATATCACTGCAGGTCAGCGGGCTTGATAAACCAAAGAGAAAAAGAGGCCGTCCACCAAAGAAACAGAAGACTGCCGAGGAGATCGCCCAGGAAACTCCCAAAACAGTGGAAATTGAGGATAAGACGGAGAAAGATGATGACGAACGTTCAGGGAAGAGGAGGAGAAAAACACCTACCAGGTTCAAGGAAGCCGTTCAGGGCAAGGAGCTGGAAAGAATATTCATTGAAGAAGGCGTCATAGATGGCAATGAGAGCGACCACAACACAAAGGCTGATACGACACAGGAAAATAAATTACCGGTGAACAAGGAACCACAAGTTATAGGGCATTTGGAGGCGTCCGGAGAGCTTGTTGTGGTGGTGAAGGGCAAGGGAAGGGGTAGACCTAAAGGTCGCACGCGTCAAACCCGCGAGGAATGCGCCATATGTGGGCTTGAGTTTGCTGCGACTGGTCGCTACATGTCCCACATCGCTCAGCATGGACCTGTTCTTTACAAGTGTGACTGCGGTCAAACATTCACTACTAAGCTACTGTTCTCCGAACATCAGAACACAAGCGGTCACAGCGGGCGGACGGTGGTGCCCTGTAGAAACGAAGTCGAGTCTCAGAAAGAGTCCGAAAAGAATGAAACGCCTTTGATCGAATTGATACCCGAGGCCGTAGAGGATGTTGTCAAAGGAGATATACAAATACCTCAAGCATTACCTGATTTGAGTGATCTCGACCCGCTGAAGTGTGATGACCATGTCAAGACTGAGACGGTGAAAAACGAACAAGAGAGAGAGGAGAATGACCCTCTGCAAGATGAGTGCGAGACAGCTGACGGAACTCGTGAGGAAGTACAGGACAGCAAGAAGGAGAAGGTCAAGATTAAGTGCAACCACTGCGATAAACTGTTCGGCACCCGGCAGAGCAAGTCGCTGCACATAAAGCAGCATCGTGACTCAAGTCGCACGCGTCAAACCCGCGAGGAATGCGCTATATGTGGGCTGGAGTTTGCTGCGACGGGTCGCTACATGTCCCACATCGCTCAGCACGGACCTGTTCTTTACAAGTGTGACTGCGGTCAAACATTCACCACTAAGCTACTGTTCTCCGAACATCAGAACACAAGCGGTCACAGCGGGCGGACCGTGGTGCCCTGTAGAAACGAAGTCGAGTCTCAGAAAGAGTCCGAAAAGAATGAAACGCCTTTGATCGAATTGATACCCGAGGCCGTAGAGGATGTTGTCAAAGGAGATATACAAATACCTCAAGCATTACCTGATTTGAGTGATCTCGACCCGCTGAAGTGTGATGACCATGTCAAGACTGAGACGGTGAAAAACGAACAAGAGAGAGAGGAGAATGACCCTCTGCAAGATGAGTGCGAGACAGCTGACGGAACTCGTGAGGAAGTACAGGACAGCAAGAAGGAGAAGGTCAAGATTAAGTGCAACCACTGCGATAAACTGTTCGGCACCCGGCAGAGCAAGTCGCTGCACATAAAGGCGGTACATCTCGGCGAGAAGTCGTACGTGTGCCCGGAGTGCGGCGCGCGGTTTGCGTACCCCCGCTCGCTGGCCGTACACCGACAAGCTCACCGCAGGGCGAGGCCCTCCGCGGGCTACGCCTGCGA
TCTCTGCGGGAAGGTGTTGAACCACCCGTCGTCGGTGGTGTATCACAAGCAGGCGGAGCACGCGGACCAGCGCTACGTGTGCGGCGCGTGCGGCAAACAGTTCCGACACAAGCAACTGCTGCAACGACACCAGCTGGTACACTCGCAGGCCAGGCCCTTCTCGTGTAAGGTGTGTAACGCCACGTTCAAGACGAAAGCCAATCTTCTCAACCACCAGCTGCTGCACTCCGGCGTTAAGAAATTCTCGTGCGAAATTTGCAAACATAAATTCGCACACAAGACCAGCCTCACGCTGCACATGAGATGGCACACAGGGGTCAAACCGTTTACTTGTGGCGTGTGCGGTAAGAGCTTCAGTCAGAAAGGGAACCTCTCGGAACACGAACGCATCCACACTGGAGAGAAGCCGTATCAGTGTGCGCTGTGTCCTCGAAGATTCACAACCTCGTCCCAGCACCGCCTGCACGCCAGGAGACACGCCGAACGAACACACTGCTGTGGAAAATGCGGGAAGCGCATGTCGTCCCGCAGCGTGTGGGCGGCGCACGTCCGGCGCGATGACTGCACGACGCGGCGGTTGGCGCGACAAAAGGTCACAAAACAAATAAGTTTATTGGTAAACGACAAGAACCATCAGCCGGTGCAGCTGGAAGATCCCAAGCTGTCCGACGACAACACCGAGGAGAGGGTCATATACGTGGCCTACGACACCGAAGACTCCGAGTCCACCGCCTTCCATATATTAGACCCAGAACAGGTGCAGACTGCTGATATAGAACAGAACAAAGTACTGACGACCTGCGAGCTTTATACACGACCGTCGCTGCTGGTGTCGCAACAACTACAGCAGTTACAGCTGGAGACGGCGGAACAGCAGGTGGTGGAACACGAGCAGCTGGAAATAGACGAACACCTGGAGCTGGAACACGAGGAACTCGGCCTGGACGACGAGCAAATTAAGATCGAGAACCAGATGGAGATTGAAGAAATTGAGGAAATAGAAACGAGTCCTGTAGTGGTCGGCGGGCAGAGCATACCCGTGACGGACGAGCGCGGTAACCCACTACACTTCACCATGGCTGACGGAACCAAGCTGGCTATCACCTCCGTGGACGGCAAGTCGCTGCAGGTGATAACACAAGACGGCCAGACGATACCGGTGGAGATCAACGGATACGACAACCAAGACCAGGTGCCGCCGAGCCCCAACGCGGTGGTTCACCAGCTCCACCTGCAGAAGACTCCGCCGCCCGCTCCCGTCACTCACTACTTCACTATCGTCTGA

Protein sequence:

>DPOGS210903-PA
MEVFLYNSTVCRLCGEENDNGTLLYSCEENNQSLCEIINTYLPIKVSDDGELPRTICPGCTIQLEATVEFLNLIINGQKILRELYQREKEYKKTVLNNSNKGTPEVISEKIIYEINTSNGVYQVEHPISLQVSGLDKPKRKRGRPPKKQKTAEEIAQETPKTVEIEDKTEKDDDERSGKRRRKTPTRFKEAVQGKELERIFIEEGVIDGNESDHNTKADTTQENKLPVNKEPQVIGHLEASGELVVVVKGKGRGRPKGRTRQTREECAICGLEFAATGRYMSHIAQHGPVLYKCDCGQTFTTKLLFSEHQNTSGHSGRTVVPCRNEVESQKESEKNETPLIELIPEAVEDVVKGDIQIPQALPDLSDLDPLKCDDHVKTETVKNEQEREENDPLQDECETADGTREEVQDSKKEKVKIKCNHCDKLFGTRQSKSLHIKQHRDSSRTRQTREECAICGLEFAATGRYMSHIAQHGPVLYKCDCGQTFTTKLLFSEHQNTSGHSGRTVVPCRNEVESQKESEKNETPLIELIPEAVEDVVKGDIQIPQALPDLSDLDPLKCDDHVKTETVKNEQEREENDPLQDECETADGTREEVQDSKKEKVKIKCNHCDKLFGTRQSKSLHIKAVHLGEKSYVCPECGARFAYPRSLAVHRQAHRRARPSAGYACDLCGKVLNHPSSVVYHKQAEHADQRYVCGACGKQFRHKQLLQRHQLVHSQARPFSCKVCNATFKTKANLLNHQLLHSGVKKFSCEICKHKFAHKTSLTLHMRWHTGVKPFTCGVCGKSFSQKGNLSEHERIHTGEKPYQCALCPRRFTTSSQHRLHARRHAERTHCCGKCGKRMSSRSVWAAHVRRDDCTTRRLARQKVTKQISLLVNDKNHQPVQLEDPKLSDDNTEERVIYVAYDTEDSESTAFHILDPEQVQTADIEQNKVLTTCELYTRPSLLVSQQLQQLQLETAEQQVVEHEQLEIDEHLELEHEELGLDDEQIKIENQMEIEEIEEIETSPVVVGGQSIPVTDERGNPLHFTMADGTKLAITSVDGKSLQVITQDGQTIPVEINGYDNQDQVPPSPNAVVHQLHLQKTPPPAPVTHYFTIV-
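The relationship between the two sequences above can be spot-checked by translating a few codons of the transcript. This is a minimal sketch that assumes the standard genetic code (NCBI translation table 1); the `translate` helper and the variable names are hypothetical and not part of the original record. Only the first ten and last three codons of DPOGS210903-TA are used, copied from the nucleotide sequence above. Note that this sketch writes the stop codon as `*`, whereas the record's protein sequence ends with `-`.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 30 and last 9 nucleotides of DPOGS210903-TA.
first_codons = "ATGGAAGTGTTTCTTTATAACTCTACAGTT"
last_codons = "ATCGTCTGA"

print(translate(first_codons))  # MEVFLYNSTV
print(translate(last_codons))   # IV*
```

Both results agree with the start (`MEVFLYNSTV`) and the end (`IV` followed by the stop symbol) of DPOGS210903-PA.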