Monarch geneset OGS2.0

DPOGS203885
Transcript: DPOGS203885-TA  1113 bp
Protein: DPOGS203885-PA  370 aa
Genomic position: DPSCF300402 + 86937-100540
RNAseq coverage: 191x (Rank: top 48%)
Annotation
Heliconius: HMEL008131  2e-132  60.48%
Bombyx: BGIBMGA003834-TA  5e-81  43.97%
Drosophila: crol-PE  3e-27  28.52%
EBI UniRef50: UniRef50_UPI000206267F  2e-31  34.28%  UPI000206267F related cluster n=5 Tax=unknown RepID=UPI000206267F
NCBI RefSeq: XP_001951603.1  4e-32  33.95%  PREDICTED: similar to zinc finger protein 35 [Acyrthosiphon pisum]
NCBI nr blastp: gi|260837031  5e-34  36.10%  hypothetical protein BRAFLDRAFT_208499 [Branchiostoma floridae]
NCBI nr blastx: gi|260794567  1e-42  32.92%  hypothetical protein BRAFLDRAFT_57705 [Branchiostoma floridae]
Group
Gene Ontology: GO:0003676  1.3e-10  nucleic acid binding
GO:0005634  1.9e-07  nucleus
GO:0008270  1.9e-07  zinc ion binding
GO:0005622  4.5e-05  intracellular
KEGG pathway 
InterPro domain: [277-303] IPR013087  1.3e-10  Zinc finger, C2H2-type/integrase, DNA-binding
[7-56] IPR012934  1.9e-07  Zinc finger, AD-type
Orthology group: MCL24943  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203885-TA
ATGTCGTGTGGAAATGAGAATTATTTTCAAAAAATAACAGAATGCTTGGATATAGATCTAACAAAATATGATCATCCTAACAAAGCATGCGACAGTTGCTTGGATCAAATTAACAAATTTCATGACTTTAAAAAATTTTGCCAAGAAACAGATAGGAGGTTGAGAGAAATTTTTGAAAATCAACACAACATCATCAAAAAAGTCGGAAGACAAAGCACTATAGTGGAGATCTTTGATTGTTTACAAACCGACAGCGAAAACGAAAAAAAAGAAATTAAAAAATCTTGGCGGTACAAACCGAAGCGAACACCTACGTATTGCAATATATGTAGAATAGATTTTAAAACTTTAGAAAAATTCAGCGAACACAGTTCTCAAGAGCACGGCATCGAAAGTGGGCTGTACAAATGTTTTGGTTGCGAGAAGAGGTTCAAAAATCGAAAAACGAGACTTGGCCATGAGCTGAAAATTTGTAAAAATCTTAAAAATGGGTATAGATGTGGCATTTGTAATAGATATCTCCCGAGACGAGGCTTGTACGAGACACATATGAGAGACCACAGAGGGAATGTACCAATGAAGCTTCCGAATGAGCTATTCAAGTGCAGAAAGTGTGACAAAGTGTTTGACACAAACGACAATCTCTCGAGACATGTCTCCGAACATGACTTGAATGAGGACAATTATATATGTGAGAAATGTGGTCGCGTATTCACAAGGAAGGACTACCTGCACAAGCACAAACTAACGCACACAGGCGAAAAACAGCACACATGTCCGCACTGCGACTTCCGGACGATACAGAGGTCGTCGCTGATTGTTCATATAAGGAAGCACACCGGCGAACGTCCCTACAAATGTAGCGTGTGTCCGCAACGGTGCATCTCCAGTTCAAACCTGAGAGCACATCAGCAAAGACACTTGGGTCTCAAAGTTCATGAGTGTACAATCTGCAATAAAAAATTCGGTTATAAAATAAGTTTAAAAGAGCACATGTCGACGCATGCTCCGTCGAGTTACTCTTGCGATCAGTGCAGCTCGACTTACTCGAGATTGAGAGGGTTAAGGCGACATGTGCTGACGAAACATGGAACCAGAAAGGAGGGACTATGA

Protein sequence:

>DPOGS203885-PA
MSCGNENYFQKITECLDIDLTKYDHPNKACDSCLDQINKFHDFKKFCQETDRRLREIFENQHNIIKKVGRQSTIVEIFDCLQTDSENEKKEIKKSWRYKPKRTPTYCNICRIDFKTLEKFSEHSSQEHGIESGLYKCFGCEKRFKNRKTRLGHELKICKNLKNGYRCGICNRYLPRRGLYETHMRDHRGNVPMKLPNELFKCRKCDKVFDTNDNLSRHVSEHDLNEDNYICEKCGRVFTRKDYLHKHKLTHTGEKQHTCPHCDFRTIQRSSLIVHIRKHTGERPYKCSVCPQRCISSSNLRAHQQRHLGLKVHECTICNKKFGYKISLKEHMSTHAPSSYSCDQCSSTYSRLRGLRRHVLTKHGTRKEGL-
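
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence read with the standard genetic code, and that the trailing "-" in the protein record marks the stop codon. The helper names translate and cds_length are hypothetical and are not part of any database tooling.

```python
# Standard genetic code, codons ordered by base T, C, A, G at each position.
# Stop codons are written as "-" to match the convention used in this record.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY--CC-W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA coding sequence in frame 1; unknown codons become 'X'."""
    seq = nucleotides.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3)
    )


def cds_length(protein_length: int) -> int:
    """Expected CDS length in bp for a protein of the given length plus one stop codon."""
    return 3 * (protein_length + 1)


# First six codons and last three codons of DPOGS203885-TA, copied from the record.
print(translate("ATGTCGTGTGGAAATGAG"))
print(translate("GGACTATGA"))
print(cds_length(370))
```

Under these assumptions, the 370 aa protein plus one stop codon accounts for the full 1113 bp transcript, and the start and end of the translated sequence agree with DPOGS203885-PA.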