Monarch geneset OGS2.0

DPOGS205523
Transcript: DPOGS205523-TA, 1893 bp
Protein: DPOGS205523-PA, 630 aa
Genomic position: DPSCF300056 + 21945-25563
RNAseq coverage: 607x (Rank: top 21%)
Annotation
Database         Hit                E-value  Identity  Description
Heliconius       HMEL003799         0.0      79.03%
Bombyx           BGIBMGA000075-TA   0.0      65.12%
Drosophila       CG42671-PF         2e-16    29.24%
EBI UniRef50     UniRef50_D6WH86    1e-22    30.11%    Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WH86_TRICA
NCBI RefSeq      XP_001811450.1     2e-23    30.11%    PREDICTED: similar to CG18490 CG18490-PB [Tribolium castaneum]
NCBI nr blastp   gi|189235871       4e-22    30.11%    PREDICTED: similar to CG18490 CG18490-PB [Tribolium castaneum]
NCBI nr blastx   gi|189235871       7e-28    29.16%    PREDICTED: similar to CG18490 CG18490-PB [Tribolium castaneum]
Group
KEGG pathway:
Orthology group: MCL26604, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205523-TA
ATGAGTGTAAGTGTCGAAGACGTTTCGAGTCCTTTATCTGAAGGACAGAGCAGTGCTCTGGATGATGAAACGAAATCCGAAGCTGAGAATAAAAGCGAGGATGTTGAGGCAGGTCAAAAATCGGAGCCCACTACCCCGAAGCCACATCCTATGGCCTTCACGATTGACTTTGGACAGGCGAAACAATTCGATAATCGCAGACTGGAGGAGCTAGCTAAGAAGTCACAGGCTAGACATCAAAGAGTTCAATCCATGTCCGCAGGTCAATCGAGACAGAGTCCTATATCTCATAGACCACCTATGAGCGGCAAACTACCCAGAAAAGCTCAGGGTTACAATTCAGAAGGCTACTTCTCGTCAGATCAGGAAGACAGTGGTGTCTCGAGTTCCGTTAAACTCCGAAGTAGCCCATTAGTTGGTGCTGATAGACCGAATATAACAAGCCCTATAACAAACAGACTAATTCGATCGCCAGTAAACGAAAGTAAACTTATGTCCAAAAGTCCTATAGTTGATAGCATAATGTCCAGAAGTGATAACTACACGCACACTCTTAATCTACCCTTGAAAAACGCTAATAGTTCCTACTTGAGAGACGTCCATAGATCGCCAAGTTACGGTAATATCAATTGCAATAAACTGCCCATGGATATCATTGACGCCAGCCCCGAGGGCGCAATGTTGACTGACATCTCTAGTCCCGAATTAGACATTCTCACCCCCGATAATGGGTTGTCAACCCCTGAACATTCAAAGAGTCCACTACGAAAGAGCAGGGCTTCGTCAGCGACCCCAGATTTATTGTTCAGGAAGTGCCCCCAAAAATCCGAACCGCTGTATATTGATGATGAAGTTGATGGAGAGTCAAACCATTCATCAACAGGTACATACACAATTGAGTGTGATAATTATACTGAAGAACAGAAAGCTAGAATGAGTATCGACAGAACTTTTGGTGTGGAACAGCCAAAAACTGTGCCAGATACATTGAAATGTACTTCAGTCGACCGGGATCCTGAAATAATATTTGACTTCCCAAATCCTAGAGTCATAACAAAAGAGCCGAGACGAGATGTCTGTCTCCGTAATCCAGTGACAAGCAAAGCTGATAATGTACAAAATAAATTATCAAATGATCGGAATGTTCTGGAAATATCATGCTGCTATGAATCTCCCAATGATGATCTGGCTCTCCAAACATCAAAACAAATCAAATCTACAAGAAGCTATTTGGAGAAGATAAAGAATAGAGTTAGAACAATAACGGAGAAGACATTCTCCAAGTCACCACCAAAGTTCGAAGAAGATGCTGATCTCGGTACATTCACATCCGTCACAACTTCTGGGGTACTCAGCTCGAAGTATCCAATCAAAGTGGAAATGCCAAGTAGAAGGCGCTGTAGTCTGACAAAATCGGAAATCGATCGGACAGATTACATCCACAGGTTGTCGAGGGAATCCTCAGTGCCTGTAAATTTACCAAGCAACAAGAACATTCTAGAAGACACCAAATTCGAGAGCACAGCACTCAAAAACGCCCAGAAAGCCCTGAAAGATGAAGTCAATCTAACATCAGATCCGTTGGTCGGCAAACTGTCCTATCTAAGCTTGAACAGGTGCAAGGGTGACAAGACCTGGATTCAGGATTGGGCGGACAGCGTCAAGAAATACAACAATAATGTGCTGACAGATGAAGCAGATCTCAATGCCACCTTCTCCATAGATGACCGTCCTCCGATCAGCCCAAGATTGATACCAACAAAACCCCGAAGTCCGCATGGAACACCAACAAAGATCCCGAGCCCCGTAGGGACTCTTCTACACCGTAGAACCCCGGGCAAGAAATGTGAAGAATTGAGAATATTCACAATGTTCATACCCTTGGACGGATAG

Protein sequence:

>DPOGS205523-PA
MSVSVEDVSSPLSEGQSSALDDETKSEAENKSEDVEAGQKSEPTTPKPHPMAFTIDFGQAKQFDNRRLEELAKKSQARHQRVQSMSAGQSRQSPISHRPPMSGKLPRKAQGYNSEGYFSSDQEDSGVSSSVKLRSSPLVGADRPNITSPITNRLIRSPVNESKLMSKSPIVDSIMSRSDNYTHTLNLPLKNANSSYLRDVHRSPSYGNINCNKLPMDIIDASPEGAMLTDISSPELDILTPDNGLSTPEHSKSPLRKSRASSATPDLLFRKCPQKSEPLYIDDEVDGESNHSSTGTYTIECDNYTEEQKARMSIDRTFGVEQPKTVPDTLKCTSVDRDPEIIFDFPNPRVITKEPRRDVCLRNPVTSKADNVQNKLSNDRNVLEISCCYESPNDDLALQTSKQIKSTRSYLEKIKNRVRTITEKTFSKSPPKFEEDADLGTFTSVTTSGVLSSKYPIKVEMPSRRRCSLTKSEIDRTDYIHRLSRESSVPVNLPSNKNILEDTKFESTALKNAQKALKDEVNLTSDPLVGKLSYLSLNRCKGDKTWIQDWADSVKKYNNNVLTDEADLNATFSIDDRPPISPRLIPTKPRSPHGTPTKIPSPVGTLLHRRTPGKKCEELRIFTMFIPLDG-