Monarch geneset OGS2.0

DPOGS202001
Transcript         DPOGS202001-TA   1329 bp
Protein            DPOGS202001-PA   442 aa
Genomic position   DPSCF300060 + 473225-504054
RNAseq coverage    642x (Rank: top 20%)
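The transcript and protein lengths listed above can be cross-checked with a little arithmetic. The sketch below assumes that the transcript is coding sequence only (no UTRs), with one stop codon, and that the genomic coordinates are 1-based and inclusive; neither assumption is stated in the record, and the function names are illustrative.

```python
def cds_matches_protein(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly protein_aa codons plus one stop codon.

    Assumption: the transcript contains only coding sequence (no UTRs).
    """
    return transcript_bp == 3 * (protein_aa + 1)


def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record for DPOGS202001.
print(cds_matches_protein(1329, 442))   # 1329 == 3 * 443
print(span_length(473225, 504054))      # genomic span covered by the gene model
```

Under those assumptions the genomic span (30830 bp) is much longer than the 1329 bp transcript, which would be consistent with a gene model containing long introns.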
Annotation
Heliconius       HMEL006580               1e-88    85.28%
Bombyx           BGIBMGA010407-TA         8e-101   88.48%
Drosophila       hdc-PC                   1e-86    60.24%
EBI UniRef50     UniRef50_UPI0002062A6F   7e-99    41.80%   UPI0002062A6F related cluster n=1 Tax=unknown RepID=UPI0002062A6F
NCBI RefSeq      XP_975376.1              3e-130   51.52%   PREDICTED: similar to Headcase protein [Tribolium castaneum]
NCBI nr blastp   gi|340726453             6e-137   58.48%   PREDICTED: headcase protein-like [Bombus terrestris]
NCBI nr blastx   gi|340726453             9e-144   58.26%   PREDICTED: headcase protein-like [Bombus terrestris]
Group
KEGG pathway 
Orthology group    MCL15147   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202001-TA
ATGGCTCCACGACGACAAGGCGCCACCGGGGAACGGGAGTCGGAGGACGCTCACGGCTGTTGTGTACCCGGAGAATGTCTCAGACCTCGAGAACCTGTGCGTGTTGAGGACGCAATTCGCGTACTATGCAACAATGACACGTGCGCCGCCGGCCGATACATGCACAGAGAATGCTTCGAGCAATGGGAAGCCGGGGTACTCGCTTATCTGAAATCGTGCGGCCGCGCCAGGTCCTGGTCAGAGAGGCAAAGACATCAAAATCTATGGACGAAGAAGGGCTACGATCTAGCATTTAAGGCCTGCGGTTGCCGATGTGGGCGAGGACATTTAAAGAAAGATCTCGATTGGATTCCACCGGGAACAGCTATCAGGGCCGAGGAAGAGAAGAAGAAGCGTAGGCGAGCTCGTCCTGGCAGGACTCCGGCGGCGGTAGACACGCGCTCAAGGGCGTTCAGTCTTTCAAGTTCAGGGTCATGCTCGCCGCCATCCGCACCTTCGGAGCCATCGACAAGCCCGACACACGCTCCACCACACTCGCTGACACCAGCCAAGAGGAACTCTAAGCCCGAAATAGTATCCGACAGAGTTAGGGCGGGTGCTGGCGCGAATGGCATATTTTCAAGACGACTAGACTTCTCCACATTCAACCTGCTCCCCAAATATAAAGTCAACTCGTATCAGATCAAGGTCAACGCTCTCGTCGAGACGTTTTATGTGTTCCTGGTGTCACGACGTCATTCTAATAATAGAATAATAGGAACGTTAACAGCAAACATGCTGATTGAGGATGAAGGCAACCATGGCAACGATGACACACGTTTGTTCATTCTATCCACCTTGGCTGGACAGCACAAGCCCCGTGTATCATGTGCGTTGTGCAAGGAAACCCTCCACGTCTTCGACCGGTATCCTTTAGTGGATGGCACATTCTTCCTTAGCCCTCGACAGCATACCAGCAGTGCTGTTGAGGTAAAAGTGGAAGGACGTACGCAATTTTTAACTTGCGTATGCATGGGATGCCTAGAGCGATGTGACCCTGAACGCACAATTTGCTGTCGTTTTTGTGGGCAAAAGTGGGATGGCTCCTCGTTGGTGCTCGGTACCATGTACTCTTATGACATCTTTATGGCGACACCTTGTTGCGCTGAAAGACTGAAGTGTAACAACTGCTACAAGCCGCTTCTCCATCCACACCAGCGGCTGAACTACTCCGACTATTCGCATCCACTAACATGCCCGCACTGTCGTGTGGTGGACACGCACTTCGTTAAGCCGTTGTCCTACTGCTTCACCAAACGGGTGTTCCCGTTATTCCAGCAGTGGCCATAA

Protein sequence:

>DPOGS202001-PA
MAPRRQGATGERESEDAHGCCVPGECLRPREPVRVEDAIRVLCNNDTCAAGRYMHRECFEQWEAGVLAYLKSCGRARSWSERQRHQNLWTKKGYDLAFKACGCRCGRGHLKKDLDWIPPGTAIRAEEEKKKRRRARPGRTPAAVDTRSRAFSLSSSGSCSPPSAPSEPSTSPTHAPPHSLTPAKRNSKPEIVSDRVRAGAGANGIFSRRLDFSTFNLLPKYKVNSYQIKVNALVETFYVFLVSRRHSNNRIIGTLTANMLIEDEGNHGNDDTRLFILSTLAGQHKPRVSCALCKETLHVFDRYPLVDGTFFLSPRQHTSSAVEVKVEGRTQFLTCVCMGCLERCDPERTICCRFCGQKWDGSSLVLGTMYSYDIFMATPCCAERLKCNNCYKPLLHPHQRLNYSDYSHPLTCPHCRVVDTHFVKPLSYCFTKRVFPLFQQWP-