Monarch geneset OGS2.0

DPOGS206164
Transcript: DPOGS206164-TA (5091 bp)
Protein: DPOGS206164-PA (1696 aa)
Genomic position: DPSCF300523 - 18193-38648
RNAseq coverage: 262x (Rank: top 41%)
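
The transcript and protein lengths listed above agree with each other if the 5091 bp transcript is coding sequence only and ends in a stop codon: 5091 / 3 = 1697 codons, which is 1696 residues plus the stop. A minimal Python sketch of that arithmetic follows; the function name `expected_protein_length` is hypothetical and not part of the geneset or any library.

```python
def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    Assumes the sequence is coding-only (no UTRs) and in frame.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    # The terminal stop codon does not add a residue.
    return codons - 1 if has_stop else codons


# Lengths taken from this record: 5091 bp transcript, 1696 aa protein.
print(expected_protein_length(5091))
```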
Annotation
Heliconius: HMEL011823; 0.0; 56.32%
Bombyx: BGIBMGA009297-TA; 0.0; 65.04%
Drosophila: Bap170-PA; 5e-84; 31.73%
EBI UniRef50: UniRef50_UPI00022CA1A6; 1e-116; 35.95%; UPI00022CA1A6 related cluster n=4 Tax=unknown RepID=UPI00022CA1A6
NCBI RefSeq: XP_001653162.1; 7e-114; 33.30%; Brahma associated protein 170kD, putative [Aedes aegypti]
NCBI nr blastp: gi|328791607; 1e-117; 35.82%; PREDICTED: hypothetical protein LOC724311 [Apis mellifera]
NCBI nr blastx: gi|189236343; 7e-130; 30.65%; PREDICTED: similar to AGAP006990-PA [Tribolium castaneum]
Group
Gene Ontology:
  GO:0003677; 2.7e-23; DNA binding
  GO:0005622; 2.7e-23; intracellular
  GO:0005488; 4.9e-06; binding
KEGG pathway:
InterPro domain:
  [8-104] IPR001606; 2.7e-23; ARID/BRIGHT DNA-binding domain
  [270-477] IPR016024; 4.9e-06; Armadillo-type fold
Orthology group: MCL13874; Single-copy universal gene

Nucleotide sequence:

>DPOGS206164-TA
ATGGCAAAATCTCAAATAAATACGAAATCTAGAAATTATGTTCAAGACAAAGAAGCATTTTTGAAAGAACTTAAGCAGTTTAATGAAAGCAAAAATATCCCTTATAAAATACCGGTTGTTAATGGTGTAGATATAGATTTGTATCTCTTGTATTCATTAGTTCAACAAAGAGGAGGTCTTAGCAAAGTTAATCAAAATGATACCTGGGAGACATTTTTGCGTCAGCTCCACTTGCCACATCCATGTGTTAATGGGTCTACATTATTGAGGAGAATATATGGAATGTATTTGGAGAAATATGAAAGAGCTAAAGGTCCACCAGGTAGAGATGATGACTTAGATATGGACGACGACCCTCGTCGTGGCAGGGGCGGAGGAATGCCACGCATATCATTTGCTTCGGGTGAACCGCTTCGTACAGGAAATCGTGTGGCTGGTCCATCTGAACGCCTTACATTATCATTGTTGTCTCCGATGCCGAATGAGCAGGATTTTGCTGTTAATGTTTGCACAGTACTAGCCGCTGATCATTCTAACCGGCTTCCGTTGAGCACAACACCTCATATACTGGATTTCTTGCTAGCTCATGCTGGAGTTTATAATCACTCAAGCCTCCGCGATACAATCGGCCGCTCATACTTCGAATCTCGCGGTCGGTATCCCCACGAGTTCTGGTCGGAGCGAGCTGGTGGTGGCGGGGCGAGGGAACTGGCCGACGAGACGAAGTTCACCGGGGATCAACCTGAACTGGTCGTGCAGGCATTGGCTGCACATAACACACTCACGGATTGTCTGATGCTGGCTGGTGGTGAAGAAGAGAATATGGAGAAGATTGTTGAGGATGACACCGAGGATTGGGTGACGGAACCTTCAGAGGAAGATCAGCTGTTTGCTCCGACGTTACCAGGAGGCGCCACATGTGTTTACACACAACGTGTACTACAGATAGCTAGCATCGTACGAAGCCTGTCCTTTCACGAGGAGAATGTACAGTACTTAGCCAGGAATACCACTCTTATAAGGTTCTTACTGCTATGTGCTAACTGTTGGGTGGGAACTCTTCGTCAGAGCGGTCTAGACACGCTCGGTAACGTTGCTGCCGAGCTCATTATTAAAGACCCCGCGACATGTTTGATATCTCGTCATGTTCTATCAACCATACAATCTGCGCTCGTATCTCAAGATCGTGCTAGAGTGTTGGCAGCACTGGAGCTCTTGAATAAGTTGGCACAGAACGAAGTCAACGAGGAAGCATTACTCAAAGCATTGGAATCAAAAGTGTATAGCGACGTGTGTGCTCTGCTCACCCTCCGTGATATAATGGTGTTGGTCTGCACCCTGGAGTGTGTATACGCCCTTACCGGTCTCGGAGACCGCGCGTGTGAGGCGGTCGCACGTGTACCGGGACTGCTACACACACTCGTGTCACTGGTTACTGTTGAGGCCCAGAGCTACGGTCCCCGCGCGTGTATCCTCATGCGTGTGGTGGAGACGGTTAGCGGTCCGCCCGCGGTGGACCACGTGCAACCACACACAGTACAGAACAATATCCCCTCCCAACAGGTTCAAGCCCCAAAGCCTCAAGTGGAACCCCCCGTGGCGTCCCCCGCGGCCGCCACCCACACACAGCCTACTACACTACAACAATCCCACATGCAACAACGTACTGTACAAGAAAACGAGCACTTCGCCCAAGCGTGGCTCCGCGCTACGTACGAAGCTCTGCCCGCGTCGGACAACAGCGCGTGCGATGCTGCGGACGTGTACAGGCAGTACCTCGCGTGCTGCACCAAACTGGCTCGCAAGGGAGTCATCGCACCCGCGCACTTCCCGCGACTTGTCAGGACGGTGTTCGGCGGCACGGTTGGGCCAAACACAGTGAGCACTTCTACGGGTGAAACACAACATGTGTACATCGGCATACGAGCGAAGAATATAGCAAATAGAAGTAATCCGCCTGTTGGTCCGTCGTCACCTATATTAAAAGCTCAACTCACTAA
CAAGCCGAGCGCGACCGTTGAAACAAAGCCGGTCGTGACGCAACTGCAGACGCCAGCGCAGCCCGCGGACAACAGCAACACGTCGCTTATCAAACACCTGTTAGCGCACAAAGTAAGCGCTGCTCACACACACGTCGCCCAGAGACAGCAAAGCCAACAACGTCTACCGACCTCTGGAACAGTGGTTGTACAAACATCTACAGCGACGTCGCTCCAGAATATGGAGGTGGATCCAGAAGCGCTCATCAAATGTACGACGATAATACCCGGAACCGTCACCAGCACGTCTGTTCAGGAGAAGAAAACAGCACAGAACAAGATGCTGGCTGATCTCCTTGAGAAAAAATCAAACCCACCAGTACAGGTTGTACAGATGGGACAACAAATAAATGCACCAACTATACAAATAACGGAAACGGGACAAATAGTTCAAGTTAAATCGGAAAATATGATACAGTTATCGGATTCCGTGCAACCGAGCGCGCCGTTTTTTCAAATTAAGAACGAGCAAGGACAACTGATACAGATCAAAAACGACCAAGGACAGATTATACAACTCAAAAGCGACCAATTACAGGGCATGATTCAAATTAAGAACGACCAAGGTCAGATCGTACAGATTAAAAATGACAATCTAGCACAGTTATTACAGTCTGGTGTTCTACAGAAGAATGAGAAGGATATAGCGGAAAGTGTTGTGACGGATCACTCGTATACGGAACCACCGAACAAGAAAATCAAAGTCGAAGACAAGGCAGAGAATCCCCCGGAAAGCGTTTCAAAGACTGCTGCCAATCTGTACGCGGCCTTAGCTGCCAGCCTCCAGGATGAAGACGATCTGCTTCCACCGAAACAAGAACCCGTGGATGTTATTCAGCCATCAGTATTAGTCGGTACGCCGGAGAACCAATCAGTTTTGATACAAGAACCTATATTACAGGTGCAGCAACCAACATTACAAGTGCAACAACCAACATTACAAGTGCAGCAACCGTCGTTACAAGTGCAGCAACCCACGATACAGATGCAGCAGCCGGCGTTACAGGTACAGGTACAGCAGCCCTTACAAGTTCAACAGCCGATGCTGCAAGTTCAACCAATGGATGTACAGAATATCATGTCCCAGGCTGGACAGATTATATTGCAGGAAAAACAGGTCGCTACTCAGCAGACGCAGTTTGTACAACAGCCCATGCAACTTATAGCAGCACCAAGCACATCACAAGGTGGTTTGAGTTACATAGCGCAAAACATACCCGGTAATATGATGCAGAAAACTATCATAATAGTTCAGGGTACTGGAGGTGGTCCTCTCACACTAACGGTTAACAATCCCTCTGGTTTGGACGAGGCCACGCTAAACTCGCTCATAGCGCAGGCGACTGAGGCGATAACACAGCAGCAAATTATTCAGGTGCAGCAACCAACATTACAAGTGCAACAACCAACATTACAAGTGCAGCAACCGTCGTTACAAGTGCAGCAACCCACGATACAGATGCAGCAGCCGGCGTTACAGGTACAGGTACAGCAGCCCTTACAAGTTCAACAGCCGATGCTGCAAGTTCAACCAATGGATGTACAGAATATCATGTCCCAGGCTGGACAGATTATATTGCAGGAAAAACAGGTCGCTACTCAGCAGACGCAGTTTGTACAACAGCCCATGCAACTTATAGCAGCACCAAGCACATCACAAGGTGGTTTGAGTTACATAGCGCAAAACATACCCGGTAATATGATGCAGAAAACTATCATAATAGTTCAGGGTACTGGAGGTGGTCCTCTCACACTAACGGTTAACAATCCCTCTGGTTTGGACGAGGCCACGCTAAACTCGCTCATAGCGCAGGCGACTGAGGCGATAACACAGCAGCAAATTATTCAGAACTCAGGAGTGATACAATCACAAAGGGTGATAGTCAGTCAGTCAGCGCTAGTCAGCTCGTCACAACCCATAATGCTGAAAACTTCCATCACACAGAATCCACCTCAACTGA
CGCCAAGCCAGCAACCTATAATAACATCTCAACCACAACCTCACAAAGCACAAATCGTTAACCCTCAACAAATCGTCGTCACCCAGAAACAACCACCTGGTATAATAAGTACGTCATCTGGCAACCAGATCGTCAGCACTATAGTTGGTAGCAACCAGCAAATAATCCAAGGGAATCAGCAGTTACTGCAGGGTAACCAACAAATAATAGCGGTTTCCAACAACCAGCAAATAATAGTTAACACTCCAATGAAACCAACTCATAGAGTTGTCCAAGCGTCAAGGAACCAGGTTACAACAGTTGTGACCAGTAACCAGGCTGTCGTCACAACTGATACAAAAACTGTTCAGAGTTCAGCGAAACCTCAATCGGTGATGCGACAGGTTATAACTCGACAACCAGTCATGGTCGGCAATACCAAGATCGGTGACAAAGAAATGGTGGTCACGCAACCTGTAACTGAGAAGATTCAACAACCAAAGAAGATAGAAACTCCACCGCCACAGACGCCACTTCAGACACAGACGCCTACGACGCCAGGGTCTGAGGACACGCCCTGGATCTGTCACTGGCGGGGATGTGGGAAAACGTTCTCCAGTTCGTCCGAGGTGTTCACTCACGTGGCTCGGACCCACTGTCCCAGTACAGCCGGCGGTGAAGCCCCCTGTATGTGGCTAGACTGTGATCGAGTCCCACGGAAGACATTTGCCTTACTAAACCATCTCACTGACAAACATTGCACTCCAAATGCTCTCAAAGCAATATTCAATTCCCGTCGTCACACCGCGAGCGAGGCCGAGTCTGGTAAGCCCATGTCAGTGGGATATCCGCCGAACGCAGCGTTGGCGGCCTTGAACAAACACGCGGCGGATATGTTCAATCCCAGGGAGCTTATGGATGAAAACGAAGGCCCAGTTACGAAAAGCATTCGACTAACAGCGGCACTTATTCTCAGAAACATAGTTATTTACTCAAACACTGGTAGAAGATTACTACGTTCATACGAAGCGCATTTGGCGTCAATAGCCCTCAGCAACGTGGAGGCATCGCGAACTATCTCCCAAGTTCTGTACGATATGAACAATATATGA

Protein sequence:

>DPOGS206164-PA
MAKSQINTKSRNYVQDKEAFLKELKQFNESKNIPYKIPVVNGVDIDLYLLYSLVQQRGGLSKVNQNDTWETFLRQLHLPHPCVNGSTLLRRIYGMYLEKYERAKGPPGRDDDLDMDDDPRRGRGGGMPRISFASGEPLRTGNRVAGPSERLTLSLLSPMPNEQDFAVNVCTVLAADHSNRLPLSTTPHILDFLLAHAGVYNHSSLRDTIGRSYFESRGRYPHEFWSERAGGGGARELADETKFTGDQPELVVQALAAHNTLTDCLMLAGGEEENMEKIVEDDTEDWVTEPSEEDQLFAPTLPGGATCVYTQRVLQIASIVRSLSFHEENVQYLARNTTLIRFLLLCANCWVGTLRQSGLDTLGNVAAELIIKDPATCLISRHVLSTIQSALVSQDRARVLAALELLNKLAQNEVNEEALLKALESKVYSDVCALLTLRDIMVLVCTLECVYALTGLGDRACEAVARVPGLLHTLVSLVTVEAQSYGPRACILMRVVETVSGPPAVDHVQPHTVQNNIPSQQVQAPKPQVEPPVASPAAATHTQPTTLQQSHMQQRTVQENEHFAQAWLRATYEALPASDNSACDAADVYRQYLACCTKLARKGVIAPAHFPRLVRTVFGGTVGPNTVSTSTGETQHVYIGIRAKNIANRSNPPVGPSSPILKAQLTNKPSATVETKPVVTQLQTPAQPADNSNTSLIKHLLAHKVSAAHTHVAQRQQSQQRLPTSGTVVVQTSTATSLQNMEVDPEALIKCTTIIPGTVTSTSVQEKKTAQNKMLADLLEKKSNPPVQVVQMGQQINAPTIQITETGQIVQVKSENMIQLSDSVQPSAPFFQIKNEQGQLIQIKNDQGQIIQLKSDQLQGMIQIKNDQGQIVQIKNDNLAQLLQSGVLQKNEKDIAESVVTDHSYTEPPNKKIKVEDKAENPPESVSKTAANLYAALAASLQDEDDLLPPKQEPVDVIQPSVLVGTPENQSVLIQEPILQVQQPTLQVQQPTLQVQQPSLQVQQPTIQMQQPALQVQVQQPLQVQQPMLQVQPMDVQNIMSQAGQIILQEKQVATQQTQFVQQPMQLIAAPSTSQGGLSYIAQNIPGNMMQKTIIIVQGTGGGPLTLTVNNPSGLDEATLNSLIAQATEAITQQQIIQVQQPTLQVQQPTLQVQQPSLQVQQPTIQMQQPALQVQVQQPLQVQQPMLQVQPMDVQNIMSQAGQIILQEKQVATQQTQFVQQPMQLIAAPSTSQGGLSYIAQNIPGNMMQKTIIIVQGTGGGPLTLTVNNPSGLDEATLNSLIAQATEAITQQQIIQNSGVIQSQRVIVSQSALVSSSQPIMLKTSITQNPPQLTPSQQPIITSQPQPHKAQIVNPQQIVVTQKQPPGIISTSSGNQIVSTIVGSNQQIIQGNQQLLQGNQQIIAVSNNQQIIVNTPMKPTHRVVQASRNQVTTVVTSNQAVVTTDTKTVQSSAKPQSVMRQVITRQPVMVGNTKIGDKEMVVTQPVTEKIQQPKKIETPPPQTPLQTQTPTTPGSEDTPWICHWRGCGKTFSSSSEVFTHVARTHCPSTAGGEAPCMWLDCDRVPRKTFALLNHLTDKHCTPNALKAIFNSRRHTASEAESGKPMSVGYPPNAALAALNKHAADMFNPRELMDENEGPVTKSIRLTAALILRNIVIYSNTGRRLLRSYEAHLASIALSNVEASRTISQVLYDMNNI-
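
The protein record can be spot-checked against the nucleotide record by translating codons with the standard genetic code. The sketch below is a minimal illustration, assuming the standard code applies to this nuclear gene and rendering stop codons as `-` to match the terminal character of the protein sequence above. The `translate` helper and `CODON_TABLE` name are hypothetical, and the two test strings are the first 30 and last 18 bases of DPOGS206164-TA.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    nt = nt.upper()
    if len(nt) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(nt), 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 bases of the transcript; should match the start of the protein.
print(translate("ATGGCAAAATCTCAAATAAATACGAAATCT"))
# Last 18 bases of the transcript, including the TGA stop codon.
print(translate("GATATGAACAATATATGA"))
```

The first call yields `MAKSQINTKS` and the second `DMNNI-`, which agree with the beginning and end of DPOGS206164-PA.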