Monarch geneset OGS2.0

DPOGS214614
Transcript: DPOGS214614-TA; 2916 bp
Protein: DPOGS214614-PA; 971 aa
Genomic position: DPSCF300050 + 10697-21844
RNAseq coverage: 71x (Rank: top 66%)
Annotation
Heliconius: HMEL010732; 2e-104; 78.65%
Bombyx: BGIBMGA001777-TA; 2e-72; 37.02%
Drosophila: crol-PE; 3e-33; 26.80%
EBI UniRef50: UniRef50_UPI00020F6A30; 4e-49; 29.81%; UPI00020F6A30 related cluster n=1 Tax=unknown RepID=UPI00020F6A30
NCBI RefSeq: XP_001946669.1; 7e-48; 30.35%; PREDICTED: similar to Zinc finger protein 271 (Zinc finger protein 7) (HZF7) (Zinc finger protein ZNFphex133) (Epstein-Barr virus-induced zinc finger protein) (ZNF-EB) (CT-ZFP48) (Zinc finger protein dp) (ZNF-dp), partial [Acyrthosiphon pisum]
NCBI nr blastp: gi|358420052; 2e-50; 30.76%; PREDICTED: uncharacterized protein LOC516002 [Bos taurus]
NCBI nr blastx: gi|260789631; 9e-63; 29.21%; hypothetical protein BRAFLDRAFT_61483 [Branchiostoma floridae]
Group:
Gene Ontology: GO:0003676; 8.2e-12; nucleic acid binding
    GO:0008270; 2.3e-05; zinc ion binding
    GO:0005622; 2.3e-05; intracellular
KEGG pathway:
InterPro domain: [846-871] IPR013087; 8.2e-12; Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group: MCL26806; Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214614-TA
ATGGATTCTCATTTAATGCATGAGGCGGTCCCTCACAACATAAAGATGGATATGCACGACGGTTCGCACAACGTTAAACTTGACCTCCACGAAGTTCCTATATCACACAACATCAAAATGGACGGTCACGTTCACGAGGTGCCAGTCACACACATAGGTCATATGAACCAAAACAATCAGTTACAACAGCAAAACGTCACACAAAACGTGGAGACGGAACCGGAGAATCTGCTGAAGCCGCGAATGGAGAAGAAAAAGAAAGAAGCCATCGACGGACAAGCTAGAGAAATAATTTACAAAGTTATCAAGTTCTTCGAGAGTGAAAAACAGAATAGAGGTTACGCGTTCCCGGTTGAAAATGTAGTTAAAAGGGCCTGTGCGGCCACCGGCCTGTCTGAAAGCACTATAAAGAGGATTAAACGGGAAGGTATCAGAGCTGAGGCGACACACACGAAAATGGCTGGCCCAAAGAAGAAGAGAGTCCGGAAAACTAAAGTCCAACTGGATTATTTCCAACTGTGTGCGCTACGAGGCATTGTTAACAGCTACAGTATGAGAAAAGAGGTACCTACCTTAGGCAAAATACTAACAGCGGCCAAACACGAACTGAATTACAGAGGTGGCAAGGAATCTCTCAGGCTAATATTACTGAACAAGCTAGGTATTAAGTTTAAGAAGTGCGAAAAGAAAAATAAAAAACCTCCGGAGCCCAATCAGGGGGCGGCAATTCAGCCGATGCCAAATGTTATGTCACATTTGCCAATGCAGCAAATGAAACCAGAGAACCAATGTATTTATAGTAATATGATGCCCCAGACACCAGGAAAAACGTTATCACAATTGTCCATAAGCAAAAATATAGATTATGACCTGACATGTACCGAGAATGACAATGATAGCAAAATAAAAATAGAATTAAACGATTCGGGCAATGATTACAATGAAACGATAGTTAAGGAGGAATTAGATGATACGTTAAATTATAACGGAATAGGTAATATTGATATCGGGTTGAAGAAAATCAGGAGTGTAAGGAGGAAGTGCAAGATAAGTGATACTAGATTCACTGATGATGAGAAGGAGAGTCTGGTTGTGCCGTTTTTGATCAAACTCAATGATGTTTTTGCATTACCCAAGAAGAGGAACGGCGTCCATTACAAGAATATGTTGGCGATATGTAGTAAATATATTGGTAATTTGGCCGAAGTTAGGAGAAATGATCTTGAAAGAGTTGGGCCAAATCTCGAATGCACGGACTGTGGTGCACAGATCACAGACCGGGATATGATGGCGCACTGGGACGAGCACAGAGTACACCTGCACAGATGCAACCTCTGTGACGTCATCTCCAGGTCAAGGAAGGAGATCATACAGCACATAACAGAAGTGCACACCAAGGTGTACACCTGCAAGGAATGCGGGATTAAATGCTGGAAATTGCAGGAGTTCAATAAACATTACCGGAACTTTCACAAATACTTTGTGTGTGATCATTGTGATAAAAAGTTCTATAGCAAGTCTGTCATTGAGAGGCATATCAGATGTCGTCACCTCCGCCCCCCGCCGCCGGAGCCCCCGGAGCAAGCGTACTGTGTGGAGTGCGACCGAGTGTTCCCAAGCCAGCAGATGTACAGGAGACACCTGAGGACTGCCGCCGCTCATCGACCGCCTAAAAACACCAAGGTGCCGTGCCCGGACTGCGGGAAGACGTTCAGCAGGAAGGTGTACATGAACAATCACCACAAGCAAGTCCACAGGCGGGACTCGCCGCACTACTGCAGGGACTGTGACAAGTACTTCATCAACGGCTACGCTATCCGCACCCACGTTAAATTCGTCCACGAGAAGAGCGAGAAACCCAAGAACAAGATCTGCGACATCTGCCAGCGCGGCTTCCACACCAATCGCGTGCTGTCAAACCACCGCCGCACTCACACTGGCGAGAGGCCGTACTGCTGTGAACACTGTGGGGCGGCGTTCGCTCAGCTACAGGCTAGGAA
GACTCACGAGAGGACGCAGCACAGGGCGGCACAGATGAGGCCCAAGCAGTCAATGCAATTTACCGTGAAACATCTATCATTGGATGAACAGAGAGAGGAGATTGAATTGAAGAAGGGGTCGCCGGAATATACACAAAGGAATTATAAATGCAAGGACTGTGGTCTGGGTTTTATTAGTGAAGATGTCTTGGAAGAGCACATCGTGAAGCATTCAGAGTCTAATGGCGCGAATATGTGCGATGTATGCACATTACGATTCAAATCAAAGACGGTATTGACACAACACAAACTACTACACTCCCGCGTGTTTGTCTGTAATAAATGTGGTGTTCATATAAAAAAGTGGTCTCACGCTCTCACACACAGACACAAGTGTTGGGACGTTTGTGTATCTGTGTGCGTGTATTGTAAGAAAGTGTTCAACAATAAAAACTCCTTGGATGTTCATATAAGAGGGGTTCATAAGAATATTAAGAAATACGTTTGTGTCGAATGCAAACGACTTTTCGGTACAAAACAACGTCTGCGGGTTCATATGAGATCACACACTGGCTCTAAACCTTTCGTGTGTGACTGCGGTCGTAAATTCACGACTAAATCGAATCTAAAGTCCCATCAAAACGTGCACAGCAGTTCCAGAGAACATTACTGCGTTGAGTGCAACAGATATTACAAGACTGAGAGGGGCTTGAAGAAACATTATAAGGACACGTTGAAGCACGGGGGATATGGCGCGATTCCTCGCTCTCAGTGTGATGATAAATTTCATTCGGAGACCGCTGTCAGCTCGCATGTTCGAGTCAGACACTCCACGGAGTATACCTGCGGTGTGTGTAACAAGAATTATTCAAGCAACTCTAATCTTAGGAAACATCTCCGCAGCGTGCACAACCTGACAGATATAGATATTGCATAA

Protein sequence:

>DPOGS214614-PA
MDSHLMHEAVPHNIKMDMHDGSHNVKLDLHEVPISHNIKMDGHVHEVPVTHIGHMNQNNQLQQQNVTQNVETEPENLLKPRMEKKKKEAIDGQAREIIYKVIKFFESEKQNRGYAFPVENVVKRACAATGLSESTIKRIKREGIRAEATHTKMAGPKKKRVRKTKVQLDYFQLCALRGIVNSYSMRKEVPTLGKILTAAKHELNYRGGKESLRLILLNKLGIKFKKCEKKNKKPPEPNQGAAIQPMPNVMSHLPMQQMKPENQCIYSNMMPQTPGKTLSQLSISKNIDYDLTCTENDNDSKIKIELNDSGNDYNETIVKEELDDTLNYNGIGNIDIGLKKIRSVRRKCKISDTRFTDDEKESLVVPFLIKLNDVFALPKKRNGVHYKNMLAICSKYIGNLAEVRRNDLERVGPNLECTDCGAQITDRDMMAHWDEHRVHLHRCNLCDVISRSRKEIIQHITEVHTKVYTCKECGIKCWKLQEFNKHYRNFHKYFVCDHCDKKFYSKSVIERHIRCRHLRPPPPEPPEQAYCVECDRVFPSQQMYRRHLRTAAAHRPPKNTKVPCPDCGKTFSRKVYMNNHHKQVHRRDSPHYCRDCDKYFINGYAIRTHVKFVHEKSEKPKNKICDICQRGFHTNRVLSNHRRTHTGERPYCCEHCGAAFAQLQARKTHERTQHRAAQMRPKQSMQFTVKHLSLDEQREEIELKKGSPEYTQRNYKCKDCGLGFISEDVLEEHIVKHSESNGANMCDVCTLRFKSKTVLTQHKLLHSRVFVCNKCGVHIKKWSHALTHRHKCWDVCVSVCVYCKKVFNNKNSLDVHIRGVHKNIKKYVCVECKRLFGTKQRLRVHMRSHTGSKPFVCDCGRKFTTKSNLKSHQNVHSSSREHYCVECNRYYKTERGLKKHYKDTLKHGGYGAIPRSQCDDKFHSETAVSSHVRVRHSTEYTCGVCNKNYSSNSNLRKHLRSVHNLTDIDIA-
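
Length check:

The transcript and protein lengths reported above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the record. It assumes the transcript consists entirely of coding sequence (it begins with ATG and ends with the stop codon TAA, and the trailing "-" in the protein marks that stop), and it assumes the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumption: the sequence is pure CDS (no UTRs) and, when
    includes_stop is True, its last codon is a stop codon.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values taken from the record above.
print(expected_protein_length(2916))  # 971, matching the reported 971 aa
print(genomic_span(10697, 21844))     # 11148 bp on scaffold DPSCF300050
```

Under these assumptions, the 2916 bp transcript gives 972 codons, that is 971 residues plus the stop, which agrees with the reported protein length. The genomic span is considerably longer than the transcript, which would be consistent with the presence of introns.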