Monarch geneset OGS2.0

DPOGS212234
Transcript        DPOGS212234-TA  3402 bp
Protein           DPOGS212234-PA  1133 aa
Genomic position  DPSCF300263  +  211212-219709
RNAseq coverage   59x (Rank: top 68%)
Annotation
Heliconius      HMEL016797        0.0     84.90%
Bombyx          BGIBMGA004446-TA  0.0     74.77%
Drosophila      CG10137-PA        2e-133  38.74%
EBI UniRef50    UniRef50_Q9VIU5   4e-131  38.74%  CG10137 n=11 Tax=Diptera RepID=Q9VIU5_DROME
NCBI RefSeq     XP_973537.1       3e-149  43.25%  PREDICTED: similar to CG10137 CG10137-PA [Tribolium castaneum]
NCBI nr blastp  gi|350419555      2e-150  43.48%  PREDICTED: centrosomal protein of 104 kDa-like [Bombus impatiens]
NCBI nr blastx  gi|350419555      2e-147  42.01%  PREDICTED: centrosomal protein of 104 kDa-like [Bombus impatiens]
Group
KEGG pathway 
InterPro domain  [1-153]  IPR008979  7.9e-07  Galactose-binding domain-like
Orthology group  MCL14968  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212234-TA
ATGCCGAAACGTATACCCTTCCACATAGTTTACGCAACCAGTGAAGATAGTTCATATCCAGCATGCGAGTTGAACGCCCAGGGTCCTGCGGCTCGCGGATGGCGGAGCGCTGGTCCTCCGCCCCACGAGCTCCTGCTGCGCCTCACCGCCGTTACCAGCATACACAAGCTACAGCTTCTAGCTCATCATCAGCTGATACCTGCTTGCGTAGAAGTGTTAGTGTCTGGAGGTCTGCTGTCAGAGGGAGCTGCGACACCGTGCGGAGCAACTTACACTAGTGTCGGTAGAGTGACACTCGCCAAACCAGCGCCCCAAGCACGCACTAGAGAACTAAGATCGGCTGCTCTGCCCGAGCCAACGGTAGCTCGCTTCGTAAAACTGAGGCTATCTGGACCACATCCACCAGCAAAAGACGATGAGCAGGTTGCGTTAATGGCTGTAAACGTTCTTGGTGATGAGGTGGAAGACGTCGCCAAATCGTTGCCAACAACAAAAGCTGAGGTGTGTTTCTCGCCTTACGATGACCTGGCCTTCGTTATGTATGTGGACAATGAAATTGCGGATCTCGTTCGTAATTTAGATGAAAAGAAAAAAACAGCTGTATGTGAAGAACGATTCGAATATGCACGACGGCTCAAATCAGCTGGTCAGGCTTTAGCTGCTGCGGGCATCAGGATCGGGAGATGGAGACTTCGCAAGAGAACCGCCGCAGCTCGGGATGACTTCGAACTGGCGAGACGCATGAGAGACAGAATAGCAGACGCACTGATCGGCGTCCAAGAAGACCCAGAGTTGAGGAGACTATTTGAAGATGATGGACCGGACACTCGCAACGACTCTTCTATGCCCCAAGCCTACGACTTCTCCCACCATCTGTCGCCGTCCGTCGCTATGGGAGTTCATAGCGTCGAAATTCCCTCGCCTGTACCGCCCATCGAACATTTACCAGAAAACGAATTCAATGGAGATCACATCGACAGTCATAATATACTCGCTTCACCCGTCCATATTCTTGAAGATGAAACCGAAGTACCAGAAGAACCGGCCCAACCAGATGAACCGATCCAAGAAGATAAAACTGAAGCTCAAAAGATAGAAGAAGAATTAAGAAAGGAGACTGAAAGTCCCCGTAGAAGTATAACTCCTACTGCCTCTAATGGTAATAGAGCATCAGAACTAAGCTATCCAGGTACATTAGTGAGACGAAGAAACAAAAGTGCTGGTCCCAGGTCTACTTTTGAAGCTTATGAAGAAAGATTATTGCCTGCACTCAGACATTCACATACAAACGAATACCTCCGTGAGGCCCGTGAAGAAGACTGCACAGGAAGCTCTTCTTCACATCCTCGTGTAGTACACAAGTTGAATGAGCGGGAACGAAAACAGGCCGCGCTGCCGATACTTATATTTGGATATCCTTTGGTTGAAAAATTCTTCTCCAAAAGCTATTTGGACAAGGAAGAAGGTCTGGCGCGCCTGCGAGCTGAGTTGACGTCACCATCGAACGGCAGCACCAAGACGTCTCCGAACAAAACAGCGCGAGCAGCGGCGACTTTGCTCCAGAGAGTTCTGAGAGATAAAGTATTCTCAGTCTACAGTCAAGCCAATGAAGTTGTCAGAGTGCTTTTCAAAGAATTCGTCCCTGAAAGGGTTTGCGCAGCGGAAGTAGGTCGATGTCTGGACAAACTCCTCCCTGAACTGCTGCGTGCTTGTGGGGACCCCGCCCCACGCGTGCATTCAACGGCTCAACACACCGTGCTCACAGTTGCTGACTGTCCTCTAGTCAGAAGCCTACACACAATTCCACAACAGCTTGTTCGACCTGTAGCTGCTTCCATGCATCCTCGACTAGCTCTCTCTCGTCTTCAGATGCTGGAACAACTCATCCTGAGCCATGGAATCTCGACCGACAAGAATAGTGGTCTGACGGTGCGTCGTCTAGCGGAGTGTGGTGCTGCAGGGGCTCAACACGCAGCGGGCTCAGTCAGAGCTGCTGCTGA
AAGAATTCTCTTAGCAGCATACGCAAGATCCCCTAGAGTTGTCAGAGCACAACTTCCGCCAGACGATGCTGTCACCAGAAGAAATCTAATTTACAGACACCTCTTTCAACAATTTGATAGAATTGATATGCAGAAAATGCTAAATCAAGCACCTACAGAAGAACAACTTCTTAATGGAGATCAGTCCATTGCTGATTCAAACTTAGAAGCTAGCGTAACACAGTCTACACGAAGCGGGACTACGGTTAGTGGAATGACCACATCTTATGGAATGACGTCTTCTATGGATGCCACATCATCCTATAGCTTAAAATCAAGTGCCAGTGGTGGCACCCTGGCTCCTTCTAGTTTAAGTGGAAGTTTTACAACGTCGAGAACAAAAAGCAGTTTAAAAAAAACACCCACTAAAAAATACACACCGACAAAATCATCCAAAGACGCTACCAATTATCCTGGCTACAACAAACTAAGACTTGATAGTGCCATTAGTCCAAAACATTCCCCAAGATCATCAGTCGGTGGGAATGAAAAGGTCCATTTCCAGGAACGTCAAACGGAGGAAGTTGTGTTCCGTCGTACAAGCAGGAACTTAGAAAACCGCCACTCCATGATCCACTACGATCATGACTTGTCTAAACCCCAACTGAAAGAACGTCCAGTCACGGTTTACGAACCTCTACATTTAGAGTATAGAGACTCCCCTACTATAGGCTCGCCAAAAAATTCCAAAAATGACAACCGAAGCATGGACTCCCTTCCTATGGACTCGCCTCAAATGTCAAGAAACGATATGAGATGCGACTCTGATAGCAGAAGTTTGGATTCCCCTAAATTAAAGGCCGACTATTTTAGAGATGTGGGCTTGGAATCCCCAAAATTAGTAGCCGGGGTTAGAAATTTGCATTTGGATGAACAAAGCCAATTGGATGAAAGTGGATATTATAGTCCAGGACGAAGACAGCAGACGCAAAACAATGAGCCATACGAAGCTTATGAAGGAGTAGCAGCTGATGCTAGCAGTGAAACCACGCCGGAGCCAGTAACGAGCACATCTTGCACCTGGTGCGGTAGACGCGTGCGCACTGCTGCATTGGAGGCACACTACTGGCGAAGGTGCGTGCTTCTCGCTCGATGCCCGCACTGTCATCTTGCTCTAGAAGCCCGGGCTCTACACTCGCATTTACTGGAAGAGTGCTCGCTTAGCGAAGGATTGTGGAAGGCGTGCCAGAAATGTGGCGCGGCCTTACGTTCAGACGAAAGTGAATATCACGTCAACTGCACACCTTTAGGCTTGGATGAGTGGAAGTGTCCGTACTGTTTGACCAACATATTAGCTCGCGACCTTCCTTGGCAACGTCATCTGATGCAGTGTCCTCGCAACCCGAGACTAACACAACACTAA

Protein sequence:

>DPOGS212234-PA
MPKRIPFHIVYATSEDSSYPACELNAQGPAARGWRSAGPPPHELLLRLTAVTSIHKLQLLAHHQLIPACVEVLVSGGLLSEGAATPCGATYTSVGRVTLAKPAPQARTRELRSAALPEPTVARFVKLRLSGPHPPAKDDEQVALMAVNVLGDEVEDVAKSLPTTKAEVCFSPYDDLAFVMYVDNEIADLVRNLDEKKKTAVCEERFEYARRLKSAGQALAAAGIRIGRWRLRKRTAAARDDFELARRMRDRIADALIGVQEDPELRRLFEDDGPDTRNDSSMPQAYDFSHHLSPSVAMGVHSVEIPSPVPPIEHLPENEFNGDHIDSHNILASPVHILEDETEVPEEPAQPDEPIQEDKTEAQKIEEELRKETESPRRSITPTASNGNRASELSYPGTLVRRRNKSAGPRSTFEAYEERLLPALRHSHTNEYLREAREEDCTGSSSSHPRVVHKLNERERKQAALPILIFGYPLVEKFFSKSYLDKEEGLARLRAELTSPSNGSTKTSPNKTARAAATLLQRVLRDKVFSVYSQANEVVRVLFKEFVPERVCAAEVGRCLDKLLPELLRACGDPAPRVHSTAQHTVLTVADCPLVRSLHTIPQQLVRPVAASMHPRLALSRLQMLEQLILSHGISTDKNSGLTVRRLAECGAAGAQHAAGSVRAAAERILLAAYARSPRVVRAQLPPDDAVTRRNLIYRHLFQQFDRIDMQKMLNQAPTEEQLLNGDQSIADSNLEASVTQSTRSGTTVSGMTTSYGMTSSMDATSSYSLKSSASGGTLAPSSLSGSFTTSRTKSSLKKTPTKKYTPTKSSKDATNYPGYNKLRLDSAISPKHSPRSSVGGNEKVHFQERQTEEVVFRRTSRNLENRHSMIHYDHDLSKPQLKERPVTVYEPLHLEYRDSPTIGSPKNSKNDNRSMDSLPMDSPQMSRNDMRCDSDSRSLDSPKLKADYFRDVGLESPKLVAGVRNLHLDEQSQLDESGYYSPGRRQQTQNNEPYEAYEGVAADASSETTPEPVTSTSCTWCGRRVRTAALEAHYWRRCVLLARCPHCHLALEARALHSHLLEECSLSEGLWKACQKCGAALRSDESEYHVNCTPLGLDEWKCPYCLTNILARDLPWQRHLMQCPRNPRLTQH-
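
Consistency check (sketch):

The lengths reported for this record can be cross-checked against the two FASTA entries. The Python sketch below is only an illustration and is not part of the geneset tooling. The function names read_fasta, cds_matches_protein and genomic_span are hypothetical. It assumes that a trailing "-" in the protein sequence marks the stop codon and that the genomic coordinates are 1-based and inclusive.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # The record ID is the first word after ">"
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def cds_matches_protein(cds, protein, stop_marker="-"):
    """Return True if len(cds) == 3 * (residues + stop codon, if marked).

    Assumption: a trailing stop_marker in the protein stands for one
    stop codon that is present in the nucleotide sequence.
    """
    residues = protein.rstrip(stop_marker)
    has_stop = len(residues) < len(protein)
    expected = 3 * (len(residues) + (1 if has_stop else 0))
    return len(cds) == expected


def genomic_span(start, end):
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1
```

With the values listed above, 3 * (1133 + 1) = 3402, which agrees with the reported transcript length. Under the inclusive-coordinate assumption, genomic_span(211212, 219709) gives 8498 bp for the gene region on DPSCF300263.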