Monarch geneset OGS2.0

DPOGS210562
Transcript        DPOGS210562-TA   1536 bp
Protein           DPOGS210562-PA   511 aa
Genomic position  DPSCF300304 + 187215-189910
RNAseq coverage   628x (Rank: top 20%)
Annotation
Heliconius        HMEL009555        0.0   93.21%
Bombyx            BGIBMGA013454-TA  0.0   90.82%
Drosophila        CG5451-PA         0.0   75.78%
EBI UniRef50      UniRef50_Q9VE18   0.0   75.78%   CG5451 n=47 Tax=Eukaryota RepID=Q9VE18_DROME
NCBI RefSeq       XP_393446.1       0.0   78.71%   PREDICTED: similar to CG5451-PA isoform 1 [Apis mellifera]
NCBI nr blastp    gi|383851711      0.0   78.52%   PREDICTED: WD40 repeat-containing protein SMU1-like [Megachile rotundata]
NCBI nr blastx    gi|383851711      0.0   78.52%   PREDICTED: WD40 repeat-containing protein SMU1-like [Megachile rotundata]
Group
Gene Ontology     GO:0005515   1.2e-69   protein binding
KEGG pathway
InterPro domain   [167-511]   IPR015943   1.2e-69   WD40/YVTN repeat-like-containing domain
                  [210-510]   IPR011046   1.3e-69   WD40 repeat-like-containing domain
                  [338-376]   IPR019781   3.3e-10   WD40 repeat, subgroup
                  [337-376]   IPR001680   5.3e-09   WD40 repeat
                  [40-92]     IPR006595   3.1e-07   CTLH, C-terminal LisH motif
                  [6-38]      IPR006594   1.7e-06   LisH dimerisation motif
Orthology group   MCL14191   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210562-TA
ATGTCTATCGAAATTGAATCTGCAGATGTTATCCGTCTGATACAACAATACTTGAAGGAGTCTAACCTCACAAAAACTTTGCAAACGTTACAGGAGGAGACAGGGGTTTCATTGAACACAGTTGATAGTGTTGATGGATTTTGTGCCGACATAAATAATGGTCACTGGGATACCGTGTTAAAAGCAACAGCATCATTAAAGCTGCCTGATAAGAAACTTATGGATTTATATGAACAAGTGGTCTTGGAACTCATTGAGTTACGTGAGCTGGGTGCCGCTCGAACATTGTTGCGCCAAACCCAGCCCTGCTTGCTCATGAAGCAACAGGAGACGGATAGATACATGCATCTTGAAAATATGTTGGCTCGATCATATTTCGATCCTCGGGAAGCATACGGAGCTGGTGGCAAGGAGTGGCGACGCTCGGCGCTGGCCGCAGCACTGGCGGGTGAGGTCTCCGTGGTTCCATCTTCACGTCTCCTAGCGCTGCTGGGTCAGGCGCTGAAGTGGCAGCAGCATCAGGGTCTACTGCCGCCAGGAACCACCATTGATTTGTTCAGAGGCAAAGCTGCTATTAGGGACGAAGAAGATGACCAATACCCGACACAAGTGTCAAAGATTATAAAATTTGGCCAAAAATCTCATGTTGAGTGTGCAAAGTTTTCCCCCGACGGCCAGTACTTGGTGACGGGGTCCGTGGACGGGCTGGTGGAAGTGTGGAACTTCACGACGGGCAAGATCCGCAAGGATCTGCGGTACCAGGCGCTCGAAGAGTACATGAGCATGGAGGAAGCCGTGCTCAGCCTGGCCTTCGCGAGAGACTCCGACACGCTGGCGGCCGGAGCCAACGATGGCCGCGTCAAGGTGTGGAGGGTCGCCAGCGGACAGGTGCAGCGCAAGTTGGAGCGAGCCCACGCCAAGGGAGTCACGTGTCTGCAGTTCGCCAGAGACAATACTCAGATACTGTCCGCCTCCTTCGACCGAACCATCAGGATCCACGGATTGAAGTCGGGAAAGATTTTAAAAGAATTTCGAGGTCATACGTCGTTCGTGAACGAGGCTGTGTTCACCCCGGATGGACACAGCGTGCTAAGCGCTTCCTCCGACGGCACGGTCAAGGTGTGGTCGGTGCGCTCCGGGGAGTGTACGGCGACGTTGAAGCCGCTGGGGTCTGGGGAGCCGCCCGTCAACTCGCTGCTGCTGATGCCCAAGAACCCGGATCACTTCGTGGTGTGTAACAGGACCAACACCGTGGTCATCATGAACATGCAGGGACAGATCGTGCGCTCCTTCACCAGCGGCCGGCGCGAGGAGGAAGGCGGTGCCCTGGTGTGCGCGGCGCTCGGAGCGCGTGGCCGCCTCGTGTACTGCGCCGCCGAGGACCTCGTGCTGTACGCCTTCTGCGCCGCCAGCGGCAAACTCGAGAGGACCATCAATATCCACGAGAAGGCGGTCATCGGTATGACGCACCACCCTCACCAGAACCTGCTGGCCACCTACAGCGAGGACGGACTCCTGAAGTTGTGGAAGCCGTGA

Protein sequence:

>DPOGS210562-PA
MSIEIESADVIRLIQQYLKESNLTKTLQTLQEETGVSLNTVDSVDGFCADINNGHWDTVLKATASLKLPDKKLMDLYEQVVLELIELRELGAARTLLRQTQPCLLMKQQETDRYMHLENMLARSYFDPREAYGAGGKEWRRSALAAALAGEVSVVPSSRLLALLGQALKWQQHQGLLPPGTTIDLFRGKAAIRDEEDDQYPTQVSKIIKFGQKSHVECAKFSPDGQYLVTGSVDGLVEVWNFTTGKIRKDLRYQALEEYMSMEEAVLSLAFARDSDTLAAGANDGRVKVWRVASGQVQRKLERAHAKGVTCLQFARDNTQILSASFDRTIRIHGLKSGKILKEFRGHTSFVNEAVFTPDGHSVLSASSDGTVKVWSVRSGECTATLKPLGSGEPPVNSLLLMPKNPDHFVVCNRTNTVVIMNMQGQIVRSFTSGRREEEGGALVCAALGARGRLVYCAAEDLVLYAFCAASGKLERTINIHEKAVIGMTHHPHQNLLATYSEDGLLKLWKP-
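
The transcript and protein lengths listed above are consistent with each other: 1536 bp is 512 codons, that is, 511 amino acids plus one stop codon. The following is a minimal sketch of how that relationship could be checked, assuming the standard genetic code applies to this transcript. The function name `translate` and the `*` stop symbol are illustrative choices, not part of the record, which writes the stop as `-`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon (hypothetical helper)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First ten and last six codons of DPOGS210562-TA, copied from the record.
first_codons = "ATGTCTATCGAAATTGAATCTGCAGATGTT"
last_codons = "AAGTTGTGGAAGCCGTGA"

print(translate(first_codons))  # start of the protein
print(translate(last_codons))   # end of the protein, including the stop
print(1536 == 3 * (511 + 1))    # transcript length vs. protein length
```

Passing the full nucleotide sequence to `translate` and replacing the trailing `*` with `-` should give a string that can be compared directly against the DPOGS210562-PA entry.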