Monarch geneset OGS2.0

DPOGS204036
Transcript        DPOGS204036-TA  1263 bp
Protein           DPOGS204036-PA  420 aa
Genomic position  DPSCF300138 + 68625-71615
RNAseq coverage   240x (Rank: top 43%)
Annotation
Heliconius        HMEL004954        0.0     86.34%
Bombyx            BGIBMGA004784-TA  0.0     89.58%
Drosophila        cbc-PA            1e-159  61.85%
EBI UniRef50      UniRef50_D6WQN7   6e-170  67.55%  Putative uncharacterized protein n=2 Tax=Pancrustacea RepID=D6WQN7_TRICA
NCBI RefSeq       XP_623706.1       0.0     73.21%  PREDICTED: similar to CG5970-PA [Apis mellifera]
NCBI nr blastp    gi|350421624      0.0     73.92%  PREDICTED: protein CLP1 homolog [Bombus impatiens]
NCBI nr blastx    gi|350421624      0.0     73.92%  PREDICTED: protein CLP1 homolog [Bombus impatiens]
Group
KEGG pathway 
InterPro domain   [227-418]  IPR010655  9.9e-54  Pre-mRNA cleavage complex II Clp1
Orthology group   MCL13491  Single-copy universal gene

Nucleotide sequence:

>DPOGS204036-TA
ATGACTGAAGTGCAATTACAAGAGATTAAATTAGATCCCGATTCTGAACTTCGTTTTGAAGTTGAAACGAAAAATGAAAAAGTCGTTTTAGAGGTTAAGAGCGGCTATGCCGAGTTATTCGGCACAGAATTGGTCAAAGGCAAGCCCTATGAATTCCACACGGGAGCGAAAGTTGCTGTGTTCACGTGGCATGGCTGTACAGTGGAATTGCGAGGACGTACAGAAGTTAGTTATGTCGCCAAAGAAACTCCTATGGTTGTATACTTAAATGTACATGCAGCATTAGAACAGCAAAGGGTAGCGGCTGAACACGAAAATACAAGAGGACCGGTGACTATGGTTGTGGGTCCCGGAGATGTTGGTAAATCCACATTAACGAAGATACTCCTTAATTATGCGGTGCGGATGGGTCGACGACCTATATTTGTAGACCTGGATGTTGGCCAAGGACATATAAGTGTTCCAGGAACTATTGGTGCATTATTAGTTGAGCGTCCAGCCTCTATAGAAGAGGGTTTTAGTCAGCAAGCGCCGCTAGTGTACCACTTTGGTCACAAATCACCCGGCGACAACTTGGAGCTATACAACATGATTGTGTCACGTCTGGCTGAAGTTATCGCTGAGAGATGTGAAAATAATAAGAAAGCATCAACGTCAGGAGTGATCATCAATACATGTGGATGGGTGAAGGGAACAGGGTACAAAGTACTGACACATGCTGCCCAGGCTTTTGAGGTCGATGTTATATTGGTGTTGGACAACGAGCGTCTCTACAATGAACTGAAGAGGGACATGCCGAAGTTTGTGAAAGTTGTTTATTTACCAAAAAGTGGAGGGGTAGTTGAACGTTCCTCCACACAACGAGCTGAGGCCCGAGACGCTCGTATAAGGGAATACTTCTATGGAAATCGGACACCATACTACCCACATTCATTTGATGTTAAGTTCTCAGACCTTAAGATCTACAAGGTGGGCGCCCCCTCTCTGCCAGACTCTTGTATGCCTCTGGGTATGCGTTCGTCTGATGCTCTGACCCGCCTGGTGCCGGCCTGGCCGTCTCCGTCTCTGGCGCACCGGGTTCTGGCCGTGTCCTTCGCCCCATCACCAGACGACCACGTGCTCGCGACCAACCTGGCTGGATTCGTTTGTGTTACTGCGGTGGACATGGATCGTCAGACGATGACCATCCTATCTCCTCAGCCTCGCCCGCTGCCAGATACTATACTGCTTCTCTCAGACTTGCAGTACATGGACAACCACTAG

Protein sequence:

>DPOGS204036-PA
MTEVQLQEIKLDPDSELRFEVETKNEKVVLEVKSGYAELFGTELVKGKPYEFHTGAKVAVFTWHGCTVELRGRTEVSYVAKETPMVVYLNVHAALEQQRVAAEHENTRGPVTMVVGPGDVGKSTLTKILLNYAVRMGRRPIFVDLDVGQGHISVPGTIGALLVERPASIEEGFSQQAPLVYHFGHKSPGDNLELYNMIVSRLAEVIAERCENNKKASTSGVIINTCGWVKGTGYKVLTHAAQAFEVDVILVLDNERLYNELKRDMPKFVKVVYLPKSGGVVERSSTQRAEARDARIREYFYGNRTPYYPHSFDVKFSDLKIYKVGAPSLPDSCMPLGMRSSDALTRLVPAWPSPSLAHRVLAVSFAPSPDDHVLATNLAGFVCVTAVDMDRQTMTILSPQPRPLPDTILLLSDLQYMDNH-