Monarch geneset OGS2.0

DPOGS202335
Transcript: DPOGS202335-TA  3324 bp
Protein: DPOGS202335-PA  1107 aa
Genomic position: DPSCF300032 + 685702-810425
RNAseq coverage: 112x (Rank: top 59%)
Annotation
Heliconius: HMEL010047  0.0  83.13%
Bombyx: BGIBMGA005001-TA  1e-21  73.53%
Drosophila: CG16779-PA  2e-52  68.15%
EBI UniRef50: UniRef50_D6WP34  5e-136  35.52%  Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WP34_TRICA
NCBI RefSeq: XP_002429833.1  1e-55  38.24%  rest corepressor corest, protein, putative [Pediculus humanus corporis]
NCBI nr blastp: gi|270010845  2e-135  35.52%  hypothetical protein TcasGA2_TC014528 [Tribolium castaneum]
NCBI nr blastx: gi|270010845  6e-143  34.95%  hypothetical protein TcasGA2_TC014528 [Tribolium castaneum]
Group
Gene Ontology: GO:0005515  6e-11  protein binding
               GO:0003676  2e-09  nucleic acid binding
               GO:0003677  3.3e-07  DNA binding
KEGG pathway 
InterPro domain: [936-1003]  IPR009057  6e-11  Homeodomain-like
                 [220-254]  IPR013087  2e-09  Zinc finger, C2H2-type/integrase, DNA-binding
                 [859-910]  IPR000949  5.6e-09  ELM2 domain
                 [951-999]  IPR001005  3.3e-07  SANT domain, DNA binding
Orthology group: MCL18296  Insect specific
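
The InterPro coordinates listed in this record are residue ranges on the 1107 aa protein. As a minimal sketch, the ranges can be parsed and compared; the helper names `parse_range` and `contains` are illustrative assumptions and not part of any database tooling. The comparison shows that the SANT hit lies entirely inside the Homeodomain-like hit.

```python
import re

# Domain coordinates copied from the InterPro rows of this record.
DOMAINS = {
    "IPR009057": "[936-1003]",  # Homeodomain-like
    "IPR013087": "[220-254]",   # Zinc finger, C2H2-type/integrase, DNA-binding
    "IPR000949": "[859-910]",   # ELM2 domain
    "IPR001005": "[951-999]",   # SANT domain, DNA binding
}


def parse_range(text):
    """Turn a string such as '[936-1003]' into a (start, end) tuple of ints."""
    match = re.fullmatch(r"\[(\d+)-(\d+)\]", text.strip())
    if match is None:
        raise ValueError(f"not a residue range: {text!r}")
    return int(match.group(1)), int(match.group(2))


def contains(outer, inner):
    """Return True if the inner range lies entirely within the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


ranges = {acc: parse_range(span) for acc, span in DOMAINS.items()}
homeo = ranges["IPR009057"]
sant = ranges["IPR001005"]
print(contains(homeo, sant))  # True
```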
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202335-TA
ATGGTAACTCCAGTGAAATGTCTGTATCCAAAAATATCTACTTGTTTACTCAACCGGCCTATAACTGCTGTGGTGATACAGTCGTTGGAGGTCGCGACAACGCGACTGCCACTGCCGACATACTCACCGCGCGCCGTACTGATGTATAGCGCGGCGGTGGGCGGAGAAGGTGTCGCGCTGCAGCTCCGAGAGGCGACACCTTCAGAAATAGATGAACGGCTCTCCATAGCCTGTCACGAGCCTGAGTTACTGGCTGACCTGCTTCACGCTGCTGACATAGACGGTGAGCATGCAGGTCTCACGGAACTTCTTGGTGCATCGTCTCCAGATCTGGCTTTTTCTTCTGACGCATCAGATAGCCTGCCACTACCAGAACACGATAATGAATGCACGACATCATCATCGTCAAGTGTGAGCGGTGTCACGAGGGTGTCTGTAGCGGTGACCGGAATAGAGTCAAAGACCTTGGCCAAAAAGGTCCGACCTAAACCAGCGTCTCCGAACCGCCAGGGGCCGCAGCAATGTCAAGTTTGCAACAAAGTGTTCGGCAACGCGTCTGCACTAGCGAAGCATAAGCTGACGCATAGCGATGAGAGGAAATATGTCTGCATCACTTGCGCAAAAGCTTTTAAAAGGCAAGACCATCTGAACGGGCACATGCTTACGCACCGCAACAAGAAACCATACGAGTGTAAGGCGGATGGGTGCGGAAAATCGTATTGCGACGCTCGGTCTCTTCGACGTCATACTGAAAATCATCACCAGCCTCCCTCTGATAAGAGCACCTCATCAGAATCGAGTGTGGATCGGGAGACGGCGCGAGTGACGTCACCCTCTTCGCCCTCTCGCACTAACTCTGCGACTTCTAATATGAACAGCTCGGAGAAATCTCGCAATGACAGCAGCCCGCCGCACACACCGCCGCCGAGAGCACGCAACAAACCGAAGGCTAAAACTACGCAACAGAGCAGCCCGAAATGCGGTGGCAGCAGTGTGAGCCGTTCAAGTGCGACTGGCAGTTCTGGACTGATTCGAACAACAGATGTCAAGCCCGTCGAGTGTAATTTATGTCACCGCAGATTTAAAAACATCCCCGCCCTCAATGGCCACATGCGACTACACGGTGGATACTTCAAAAAGGATTCAGACAACAAGAAGTTAGACAAGAAAGAATCAACTGGTCCTCCTTTACAAACTGCCTCTGTATCCGTGAGAGCTCTGATCGAAGAGAAGATCATAAGTCGTCGAGGTGCTACCGTATCACAAGCACCAACCTCAGGTACAACAACCGACACAATATCTCGATCGGGGTTTATTGTTCCAGCCCCACCACCTCTATCAACTATTAAAACTTATTCGTCACCAGTAACAACGGTTACTACTTCAGCACCATTTGTTTCCCCCCGTGCACCGCCAGCAGTGACAGTATCAAATACAACTACATTAAATAGAGATTCTACGCTCATTGAACTTTTAAGAAAAGGAAGCTCAAAGGTTGTTAAGCGCTCCGCATCTGATCCTGGCCAGGCGTCACCACAGCAACAAGACTTTACTTTCCGACCTGAATTATTTGGCGTATCGTTTAACTCAGACGATGGTTATTTTTCACCTGCTTTAAATGACGATACTTTCCAATTTGCAACAGCTTCTGATCATTTAGAAGAACTTGCGTCTCTTGAAGACTACGCAACTGTTGCTGCCTCTATACGAGAGCGGTCACCCGTAACTTTCCCGTCCAGTCGTCGTTTAGCTGCAGTTCTAAATTCTCCCCTACCAGAATCTTTAGCTGATTTTGGAGCTTGTCATGGAGGATCACCTGTTCCATCTCCTGGTATCGCATATGCTGATAGTTCACCGGGATTGTCTTATACAACTGGGGATTCACCAGGGTTAGCTTACACAGCCACATCTCCTAGTGGTAGTTATTCCAATCAACCGGAACCTTCACCAAGTTTCGCTTATCCAACACCTCCAGCATCTCACGATGCCCATTCACCGGC
CCATACAGTCCCAAGAGCATCGTCGCCACTATCAGCTGCATTTTTTACGGCTACAATGTCCAGTCAAGAAGAGGTGGAAGAGGCCCTTGAAGAGGTTTTGCCTGAAGAATGCCGATCTTTAGATGCTTATGCTTTGGAATCTTCAACAACTACAAGACGAATTATGCTTAATTCCGAAGATCCCCTTTTGTCCAGTAGTCCTCGGGATTTCCCCAATCAAAGAAGTATTCGTCGCCAGAATAGAGTAGCAACACCTATGGCAGCACCTATGCAAACATGGCAACAGGACACAACAGCGCTTCAAGTGTGTGTGGAAGGTCGAGACCCACTACCAGCAGTATTTCTTAGTCCGAACAGCGTACCGGCGTCGCCTCAACAACGCAAACGTCGCGCGTCTCCAGCTGGTCCATACAAGTCGCGCATGCGTCGTAGAGTCAATCACTACACACCACAACCAACCCTACCACCAGATAGAGACGGTTGCGGTCTATTCGTAGAAATCAGAAATGCCCTTCAAGCCAACCTTGATATTACACTTGAAGATACACCACTGGAAGAGAATCGCTTACCACAAATTAATATCGGATCCGATCATCAAGCAGATATACCGGAACTGTGTAACGACCGTATAGATCTACATAGAGCTCCGGAACAACTCTTATGGGATCCTGGTATTAACGACGCACTAGATGACAATGAAGTCCGCATGTTCATGGAGCTGGCGATGTGTGCGGCGATGCCAGTTGGAGGACATACAAGAGAGAGTGCTTTACAAACATTGGGAGAGTGCGGTGGTGACGTCCGCATTGCGACGCTTCGTCTCATGTCTCGACCAGCTGCGCCCTCACAACAGGAGTCACGCTGGACTCCTGACGAAGTGGAGGCTTTCCTAGCTGGACTTGGGCAGTTCGATAAAGATTTCTTTAGGATTTCACAACTGGTAAGATCGAAGGACTCAAAACAGTGCGTTCAGTTTTATTACTTCTGGAAGAAGGTGACAAAAGACTACAAGACACTGTATTTAAGAAGTTGGGCCGATTCTCAAGCTCAAGGTTCCGTAGCACAGATTTCGTCTCGAACGACCTCATGCGCCTCGCCAACAACAGCGTACGAAGGCGAGGAGTTTCCGTGCAAAATATGCGGAAAAGTATTTAACAAAGTTAAAAGTCGTAGCGCACACATGAAGTCGCACCGGCCACTCGACGCCGAGCCCAAACGGTCAAAACTCGAAAAACCTTATGAAAAGGTCGAGAGATCTGACGAGAGGTCGATCAGATCTGAGCGACAGCAAAACACAAAAGCAAACAGCTCAGTAACTGACTGA

Protein sequence:

>DPOGS202335-PA
MVTPVKCLYPKISTCLLNRPITAVVIQSLEVATTRLPLPTYSPRAVLMYSAAVGGEGVALQLREATPSEIDERLSIACHEPELLADLLHAADIDGEHAGLTELLGASSPDLAFSSDASDSLPLPEHDNECTTSSSSSVSGVTRVSVAVTGIESKTLAKKVRPKPASPNRQGPQQCQVCNKVFGNASALAKHKLTHSDERKYVCITCAKAFKRQDHLNGHMLTHRNKKPYECKADGCGKSYCDARSLRRHTENHHQPPSDKSTSSESSVDRETARVTSPSSPSRTNSATSNMNSSEKSRNDSSPPHTPPPRARNKPKAKTTQQSSPKCGGSSVSRSSATGSSGLIRTTDVKPVECNLCHRRFKNIPALNGHMRLHGGYFKKDSDNKKLDKKESTGPPLQTASVSVRALIEEKIISRRGATVSQAPTSGTTTDTISRSGFIVPAPPPLSTIKTYSSPVTTVTTSAPFVSPRAPPAVTVSNTTTLNRDSTLIELLRKGSSKVVKRSASDPGQASPQQQDFTFRPELFGVSFNSDDGYFSPALNDDTFQFATASDHLEELASLEDYATVAASIRERSPVTFPSSRRLAAVLNSPLPESLADFGACHGGSPVPSPGIAYADSSPGLSYTTGDSPGLAYTATSPSGSYSNQPEPSPSFAYPTPPASHDAHSPAHTVPRASSPLSAAFFTATMSSQEEVEEALEEVLPEECRSLDAYALESSTTTRRIMLNSEDPLLSSSPRDFPNQRSIRRQNRVATPMAAPMQTWQQDTTALQVCVEGRDPLPAVFLSPNSVPASPQQRKRRASPAGPYKSRMRRRVNHYTPQPTLPPDRDGCGLFVEIRNALQANLDITLEDTPLEENRLPQINIGSDHQADIPELCNDRIDLHRAPEQLLWDPGINDALDDNEVRMFMELAMCAAMPVGGHTRESALQTLGECGGDVRIATLRLMSRPAAPSQQESRWTPDEVEAFLAGLGQFDKDFFRISQLVRSKDSKQCVQFYYFWKKVTKDYKTLYLRSWADSQAQGSVAQISSRTTSCASPTTAYEGEEFPCKICGKVFNKVKSRSAHMKSHRPLDAEPKRSKLEKPYEKVERSDERSIRSERQQNTKANSSVTD-
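
The transcript and protein records above can be cross-checked against each other. The sketch below is illustrative only: `translate` is a hypothetical helper, the codon table is the standard genetic code, and the stop codon is written as `-` to match the protein record. It translates the first and last codons of DPOGS202335-TA and confirms that 3324 bp corresponds to 1107 codons plus one stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 18 nucleotides of the transcript shown above.
first_codons = "ATGGTAACTCCAGTGAAATGTCTGTATCCA"
last_codons = "AGCTCAGTAACTGACTGA"

print(translate(first_codons))  # MVTPVKCLYP
print(translate(last_codons))   # SSVTD-

# Length check: 1107 residues plus one stop codon account for 3324 bp.
print(3 * (1107 + 1) == 3324)   # True
```

To check the whole record, the full nucleotide sequence can be passed to `translate` after joining its lines into a single string.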