Monarch geneset OGS2.0

DPOGS200790
Transcript: DPOGS200790-TA; 1338 bp
Protein: DPOGS200790-PA; 445 aa
Genomic position: DPSCF300370 + 78881-80218
RNAseq coverage: 339x (Rank: top 34%)
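The genomic interval above can be cross-checked against the reported transcript length. The sketch below is illustrative only: `parse_position` is a hypothetical helper, the "scaffold strand start-end" layout is inferred from this record, and the coordinates are assumed to be 1-based and inclusive.

```python
import re


def parse_position(text):
    """Split 'SCAFFOLD STRAND START-END' into its parts.

    The field layout is an assumption based on this single record.
    """
    m = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if m is None:
        raise ValueError(f"unrecognized position: {text!r}")
    scaffold, strand, start, end = m.groups()
    return scaffold, strand, int(start), int(end)


scaffold, strand, start, end = parse_position("DPSCF300370 + 78881-80218")
span = end - start + 1  # assumes 1-based, inclusive coordinates
print(scaffold, strand, span)
```

Under that assumption the interval spans 1338 positions, the same as the 1338 bp transcript length, which would be consistent with a gene model that has no introns.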
Annotation
Heliconius: HMEL009561; 3e-171; 73.48%
Bombyx: BGIBMGA011898-TA; 4e-132; 49.44%
Drosophila: CG33523-PD; 6e-64; 32.09%
EBI UniRef50: UniRef50_E2C1R1; 9e-84; 37.87%; Motile sperm domain-containing protein 2 n=9 Tax=Neoptera RepID=E2C1R1_HARSA
NCBI RefSeq: XP_973987.1; 2e-84; 36.75%; PREDICTED: similar to AGAP005556-PA [Tribolium castaneum]
NCBI nr blastp: gi|307196623; 3e-83; 37.87%; Motile sperm domain-containing protein 2 [Harpegnathos saltator]
NCBI nr blastx: gi|307196623; 6e-82; 38.23%; Motile sperm domain-containing protein 2 [Harpegnathos saltator]
Group
Gene Ontology: GO:0005198; 7.8e-18; structural molecule activity
KEGG pathway:
InterPro domain: [78-229] IPR001251; 3.8e-28; Cellular retinaldehyde-binding/triple function, C-terminal
[243-393] IPR008962; 1.1e-25; PapD-like
[261-362] IPR000535; 7.8e-18; Major sperm protein
[1-78] IPR011074; 6.8e-08; Phosphatidylinositol transfer protein-like, N-terminal
Orthology group: MCL14075; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200790-TA
ATGTCGAAAATAGCGGAGATACGGACGTTATTCGAGGCGAAGATCAAAGATGGTATCCCAGATCCGCCCGGGGAATTCGATGCTCGCGATCTGGGTAGAGTTAGCAGTGATAAATATTTGTACCGGGTGTTGGAACACTCCCGGAATAATGTCCAGCAGGCTGTTGAAATGCTTTACGAGATCATGGTCTGGAGAAAAAAGATGAACGCCAACGAAATCAACGAGAACACCGTTAATTTGGATTACCTGAAAGAAGGTATCTTCTTCCCGCAAGGCAGAGACATCGATAGCTGTTTGCTGTTCATCATGAAGTCGAAACTGTACCATAAAGGACAGAAAAACGTAGACGAAGTCAAGAAAATTATCATATACTGGTTGGAGAGAATCGAAAGAGAGGAAGACGGCAAGAAAATTACTCTCTTCTTTGATATGGACGGCTGTGGTCTCAACAACATGGATATAGAGATTATCATGTACATGGTTACGTTATTAAAGAATTATTATCCTAATCTTATTAATTACATCATCATATTCCAACTGCCCTGGATGCTGTCGGCCGGGTTTAAGATTGTCAAGGGCATCCTTCCCGCCGAAGCCATCGAGAGACTGAGAACAGTGAATAAAGATAAGCTGAAAGAGTTAGTGGCTCCGGAACAGGCGTTAGTCAGTTGGGGCGGCAAAAACGAGTATGTATTCAATTTCTTTCCAGAAAATAGGATCAGTGTTGATAACACCAGCAAATCCTCGACCCTTGATAGTCAACATTCTTTGGGTGAAATGTTGAGCTTGAACCCGGGTAAGTTATTAATATTTAAGGTCGAAAATGACAGGATATGTGCTCAATTAACGATAACAAACATGGATGACAGTGTTATATCATTTAAAATAAGAACAACTGCACCAGAAAAGTATGTCGTTAAACCAAGCTCAGGTATTTTAACGAGCAAAGCATCACAGACTATTCAAATACAAGTAAACTCGGGGTTCCAAATCAACTCGGTGGAAAAGGACAGGTTTCTGGTGGTGTCGATGCAGATACCGAGTGCTGATATATCGGCCAAAGAGATCAGTGAAATGTGGAAAACCATTGGCTGCAAGGCCGACGAGTACAGACTGAAGTGTTCAACAGTCAATATGTTGAAGTCGGAGCCAATACAGGAGAAACCGAGCCACGAGCATGACTCTATAATGTATAAATTGAACAATCTTCAAAACAATCACAAGATGCTGGTGAAGAACATCAAAACCCTGAGGATGTACCAGTATGCGACCTTATTCCTGACATTTCTCAGCCTGAGTCTGTGTTATGTAACATACAACAAGGATTGCCAGCTATAA

Protein sequence:

>DPOGS200790-PA
MSKIAEIRTLFEAKIKDGIPDPPGEFDARDLGRVSSDKYLYRVLEHSRNNVQQAVEMLYEIMVWRKKMNANEINENTVNLDYLKEGIFFPQGRDIDSCLLFIMKSKLYHKGQKNVDEVKKIIIYWLERIEREEDGKKITLFFDMDGCGLNNMDIEIIMYMVTLLKNYYPNLINYIIIFQLPWMLSAGFKIVKGILPAEAIERLRTVNKDKLKELVAPEQALVSWGGKNEYVFNFFPENRISVDNTSKSSTLDSQHSLGEMLSLNPGKLLIFKVENDRICAQLTITNMDDSVISFKIRTTAPEKYVVKPSSGILTSKASQTIQIQVNSGFQINSVEKDRFLVVSMQIPSADISAKEISEMWKTIGCKADEYRLKCSTVNMLKSEPIQEKPSHEHDSIMYKLNNLQNNHKMLVKNIKTLRMYQYATLFLTFLSLSLCYVTYNKDCQL-
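
The nucleotide and protein records above can be checked against each other by translating the transcript. The following is a minimal sketch, assuming the standard genetic code and that the trailing "-" in the protein record marks the stop codon. `translate` is a hypothetical helper, and the two short fragments are copied from the start and end of the transcript above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, mirroring the trailing '-'
    used in the protein record (an assumption about its meaning).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(stop_symbol if r == "*" else r for r in residues)


first_codons = "ATGTCGAAAATAGCGGAG"  # first 18 nt of DPOGS200790-TA
last_codons = "GATTGCCAGCTATAA"      # last 15 nt of DPOGS200790-TA

print(translate(first_codons))
print(translate(last_codons))

# Length check: 445 residues plus one stop codon should account for 1338 nt.
print(3 * (445 + 1))
```

The two fragments translate to the beginning (MSKIAE) and the end (DCQL-) of the protein record, and 3 x (445 + 1) = 1338 matches the reported transcript length. Passing the full transcript to `translate` should reproduce the full protein sequence under the same assumptions.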