Monarch geneset OGS2.0

DPOGS212811
Transcript        DPOGS212811-TA  3006 bp
Protein           DPOGS212811-PA  1001 aa
Genomic position  DPSCF300086 - 544057-552707
RNAseq coverage   461x (Rank: top 27%)
Annotation
Heliconius      HMEL008175        0.0  72.59%
Bombyx          BGIBMGA000753-TA  0.0  64.37%
Drosophila      Clbn-PB           0.0  50.19%
EBI UniRef50    UniRef50_E2BDI6   0.0  56.48%  Serologically defined colon cancer antigen 1-like protein n=3 Tax=Arthropoda RepID=E2BDI6_HARSA
NCBI RefSeq     XP_001658543.1    0.0  54.90%  hypothetical protein AaeL_AAEL007639 [Aedes aegypti]
NCBI nr blastp  gi|383852746      0.0  55.69%  PREDICTED: nuclear export mediator factor NEMF homolog [Megachile rotundata]
NCBI nr blastx  gi|383852746      0.0  56.93%  PREDICTED: nuclear export mediator factor NEMF homolog [Megachile rotundata]
Group
KEGG pathway 
InterPro domain  [9-496]    IPR008616  2.8e-38  Fibronectin-binding A, N-terminal
                 [896-994]  IPR021846  4.9e-28  Protein of unknown function DUF3441
                 [513-602]  IPR008532  2.2e-24  Domain of unknown function DUF814
Orthology group  MCL15337   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212811-TA
ATGAAGACAAGATTTAATACTTACGATATTGTGTGTATGGTGTCGGAACTGCAAAGACTGGTGGGTATGCGAGTTAACCAGGTGTATGATATTGATAACAAGACATATGTGATCCGACTGCAGAGGTCTGAAGAGAAGGCTGTGCTGCTGCTGGAGTCAGGGAACAGGTTCCACACAACACAGTTTGAATGGCCCAAGAATGTAGCTCCATCTGGGTTTACTATGAAGCTTAGGAAACATCTTAAGAACAAGCGTCTCGAGAAGCTGAGCCAGCTGGGCATTGACAGGATAGTTGAGCTACAGTTCGGTAGCGGTGAGGCCGCGTATCATGTTATACTGGAATTATATGACCGCGGTAACATAGTTCTCACAGACTGCGAGTGGACGATACTGAATGTACTACGTCCGCACGTCGAAGGTGATAAAGTAAGGTTTGCAGTCAAAGAAAAGTATCCTCTTGACAGAGCCAAAACAGACTATGCAGCACCAAATGAAGGTGCTCTCAAGGAGATATTAGGAAAGAGCAAACCTGGTGATAACCTCAAGAAAATACTTAATCCTAATTTAGAATATGGTGCATCAATAATAGACCATGTCCTGCTGCAAAATGGTCTGTCTGGTAATTTAAAGATATCACAGGATCCTAACAAAGGATTTTATGTGGAAAGGGATTTGGGAACGCTAGCAAATGCTCTAAGACAAGCTGAGACAATGATTGAGAATGGAAAGAATCAAATGGCTAAGGGTTATATAATCCAAAAGAGAGAAGATCGACCAAATCAAGATGGCGGCCCGGACTTCTTCCTCACCAACCAAGAGTTCCATCCCCTGCTGTATCTCCAGAACAAAGACCAGGTGTATGTGGAGTATGAAACCTTTGACAGAGCCGTCGATGAGTTCTATTCAGCTCTGGAAGGACAGAAAATTGATCTTAAGACGATTCAAGTTGAACGTGAAGCTATGAAGAAGCTCCAGAACATCCGCACCGATCACGAGAAGAGGCTCAGCAACTTGGAGAAGGTTCAGCTTGAGGACAGGAGGGCGGCGGAGATGATAGCTAGGAATGAACCGCTCGTCGAACAAGCGCGGCTCGCCATACAGACGGCCATAGCTAACCAGATGAGTTGGGATGACATCAAGCTGTTAGTGAAAGCGGCTCAAGACAACAAGGATCCCGTGGCGTCAGCCATAAAACAGCTGAAGCTGAACACCAACCACATTACGTTGTTGCTCAAGGACCCGTATGATGATGATGATGATGATGATGATGATGATGATGACAATGACGGCGGCGGGGACAAGGAGAGGCTGGAACCAATGATGGTTGATATCGATCTGTCTCTGACTGCCTTCGCTAACGCTAGACGTTACTACGATCAGAAACGCAGTGCTGCCAAGAAGCAGCAAAAGACGCTGGAGTCAGCGGACAAAGCTTTGAAGAGCGCTGAGAAGAAAACTAAACAAACGCTGAAGGAGGCTCAGGCCATCAGCAGCATCAGCAAAGCGAGGAGGAACTACTGGTTCGAGAAATTCTACTGGTTCATATCATCCGATAACTACTTGGTGATAGCCGGTCGTGACCAGCAGCAGAACGAATTGCTAGTGAAGCGTTATATGCGGTCTACAGACGTGTACGTCCACGCGGACGTGTCCGGGGCTTCGTCGGTGGTGATTAAATGTCCCTCGGGGCCTCCGCCCCCACGGACGCTCAGTGAAGCGGGACAAGCGGCCGTCGCATACAGTGTCGCATGGGAAGCGAAAGTCCTGACTCGTGCGTGGTGGGTCCACGGACACCAGGTGTCCAAGTCAGCTCCGACAGGTGAATATCTGTCAACGGGCTCCTTCATGATCCGCGGCAAGAAGAACTACCTGCTGCCTGAACACCTGCAGTTCGGATTCAGCTTTATGTTCCGGCTTGAAGATAGTTCCATCGACCGTCACCGCGACGACCGGAAGGCTGTTCAAGCTGATGATGCCAGTGACGTCACATCCGTCATCAG
CGCGGACGAACAGGAGATTGTTGTGTCGGATGACGACGAACCTTCAGATAACGAGGATAAGGAGAAAAAATTAAATACAATAGCCGAAGAAGTAACGAAAATAGATCTAGAAGATAATACAGAAGAAAAACCTAAGGAGACAAATGACAAAGATTTGGACCACAAAGATTCAGATGGTAACGAGAACGAGTTAAAAATTAAAGACGATTTGAAAAATGAAGACAGCGAGTCTGATGACGAAACAGGAGTGTTACACACACACGTGAAGGTGGACCACGCTACGGGCGAGGTGTTCGTGGCCTCCAAAACACGGACGATATCCGAAATGTCTGATAAAAGCGAAGAACCCATGACCTTCCCCAGTCTGCCCAAGAAGGGAGGCAAAAAACCCCAGAAGGAAGTTAAGAAGAGAGAAGAAGTTAAAGAGAAGCAAGGACCAAAACGTGGGCAGAAAGGGAAGCTGAAGAAGATAAAAGAGAAATACAAGGACCAGGACGAGGAAGATCGCGCGCTCATGATGGAGATACTCAAGCCGGATAAAAGCGCCAAGGAGACGAAGAAAGCCCAGAAGCAGGTCAGCAAGAGTAAACAGAAGCAGGCCATCAAGAAGATACCACAACCGGCTCCCGTACTGCTGGAGGCGGAGTCAGACGACGAACCGACCCCGGACAATGAGCCCGAGGCGGAGCCCGCAGCGGACGCAGACGCGGAACTCCTGTGTCAGCTGACGGGAGCTCCGCTCGATGAGGACGAACTGCTGTTCGCGGTGCCTGTGGTGGCGCCCTACTCCTCGTTACTCCAATACAAGTTCAAAGTGAAGCTAACCCCTGGCAGCAACAAGAGAGGTAAAGCCGCCAAGACAGCCGTCCAGGTGTTCCTCCGAGACAAAAACACCAGCTCCAGGGAGAAGGACCTGCTGAAGGCTGTCAAGGAGGAAAACATCGCCAGGAACTTCCCCGGGAAAGTGAAGCTGTCCGCACCACAGCTACATAAACATAAGAAATGA

Protein sequence:

>DPOGS212811-PA
MKTRFNTYDIVCMVSELQRLVGMRVNQVYDIDNKTYVIRLQRSEEKAVLLLESGNRFHTTQFEWPKNVAPSGFTMKLRKHLKNKRLEKLSQLGIDRIVELQFGSGEAAYHVILELYDRGNIVLTDCEWTILNVLRPHVEGDKVRFAVKEKYPLDRAKTDYAAPNEGALKEILGKSKPGDNLKKILNPNLEYGASIIDHVLLQNGLSGNLKISQDPNKGFYVERDLGTLANALRQAETMIENGKNQMAKGYIIQKREDRPNQDGGPDFFLTNQEFHPLLYLQNKDQVYVEYETFDRAVDEFYSALEGQKIDLKTIQVEREAMKKLQNIRTDHEKRLSNLEKVQLEDRRAAEMIARNEPLVEQARLAIQTAIANQMSWDDIKLLVKAAQDNKDPVASAIKQLKLNTNHITLLLKDPYDDDDDDDDDDDDNDGGGDKERLEPMMVDIDLSLTAFANARRYYDQKRSAAKKQQKTLESADKALKSAEKKTKQTLKEAQAISSISKARRNYWFEKFYWFISSDNYLVIAGRDQQQNELLVKRYMRSTDVYVHADVSGASSVVIKCPSGPPPPRTLSEAGQAAVAYSVAWEAKVLTRAWWVHGHQVSKSAPTGEYLSTGSFMIRGKKNYLLPEHLQFGFSFMFRLEDSSIDRHRDDRKAVQADDASDVTSVISADEQEIVVSDDDEPSDNEDKEKKLNTIAEEVTKIDLEDNTEEKPKETNDKDLDHKDSDGNENELKIKDDLKNEDSESDDETGVLHTHVKVDHATGEVFVASKTRTISEMSDKSEEPMTFPSLPKKGGKKPQKEVKKREEVKEKQGPKRGQKGKLKKIKEKYKDQDEEDRALMMEILKPDKSAKETKKAQKQVSKSKQKQAIKKIPQPAPVLLEAESDDEPTPDNEPEAEPAADADAELLCQLTGAPLDEDELLFAVPVVAPYSSLLQYKFKVKLTPGSNKRGKAAKTAVQVFLRDKNTSSREKDLLKAVKEENIARNFPGKVKLSAPQLHKHKK-