Monarch geneset OGS2.0

DPOGS201164
Transcript: DPOGS201164-TA (1827 bp)
Protein: DPOGS201164-PA (608 aa)
Genomic position: DPSCF300065 + 656043-660791
RNAseq coverage: 544x (Rank: top 23%)
Annotation
Heliconius: HMEL014978, E-value 0.0, identity 71.43%
Bombyx: BGIBMGA003961-TA, E-value 0.0, identity 67.87%
Drosophila: CG4673-PA, E-value 7e-170, identity 51.52%
EBI UniRef50: UniRef50_Q9VBP9, E-value 1e-167, identity 51.52%, Nuclear protein localization protein 4 homolog n=32 Tax=Coelomata RepID=NPL4_DROME
NCBI RefSeq: XP_970927.1, E-value 0.0, identity 56.72%, PREDICTED: similar to nuclear protein localization [Tribolium castaneum]
NCBI nr blastp: gi|91088135, E-value 0.0, identity 56.72%, PREDICTED: similar to nuclear protein localization [Tribolium castaneum]
NCBI nr blastx: gi|91088135, E-value 0.0, identity 57.49%, PREDICTED: similar to nuclear protein localization [Tribolium castaneum]
Group
KEGG pathway: tca:659535, E-value 0.0
 K14015 (NPLOC4, NPL4) maps-> Protein processing in endoplasmic reticulum
InterPro domain: [261-565] IPR007717, E-value 1.7e-91, NPL4
 [126-259] IPR007716, E-value 1.5e-50, NPL4, zinc-binding putative
Orthology group: MCL13440, Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201164-TA
ATGTCAGGAACAAAAAAAATGACGCTACGGGTCCAGTCGTCGGAGGGCACAGCCCGCGTGGAGATGCTGGACACCGAGGTCACATCGCGCCTCTACGAGCGAGTCCACGACACTCTCAACTTGAACTCATTCGGCTTCGCTTTACATAAAGACCGCGCGCGTAAACAAGAAATTTCGTCTAGTAAATCTCGTCAACTCCGAGAGTACGGTCTGCAACATGGAGACATGCTCTATTTGAGTCCTGTCAATGGAACAGTTCTCTTTGACCAGCCTTCTACTAGTTCTGAGCCACTCAACAAACCTTTGACAGAGCTATCAACAGAGGCAGGTCCTTCCACAGTGATTCCCTCAAATGCTGTCAGTAAGGGACCAATAGAACATGAGGTAGATTTACAACTGTACCGTCTCTCGGGCAGCATTCACCGACAGAGAGATGAAAAATTATGTCGTCACAATTCCAAAGGATGTTGTGTGCACTGTTCGGCACTGGAGCCCTGGGATGAGGGCTATCTTAAAGAACACAACATCAAACATATGTCATTCCACGCCTACCTTCGCAAGATGACATCAGGGAAGTTCATTACACTGGATGAACTGTCATGTAAAATAAAGCCAGGCTGCAAGGAACACCCTCCCTGGCCCCGCGGCATCTGCTCGTCGTGTCAGCCGGGCGCTGTGACGCTCACGAGGCAGCCCTACCGCCATGTGGACAACGTGCTACTGGAGCACGCCGCGCCCGTTGAGCGCTTCCTTTCCTACTGGCGCGCCACGGGTCACCAGCGCGTGGGCTTCCTGTACGGCCGCTACGAGCTCCACCCCGACGTGCCGCTGGGTATTCGCGCCCGCGTGGCCGCCGTTTACGAGCCGCCTCAGGAGTGCAGCCGGGACGCCGTCCGCCTGGCGTCGGACGACCACGCCGCGCTCCTCGACCGCCTCGCCGCCCGTCTCGGCCTCGAGCGTGTCGGCTGGATCTTCACCGACCTGCTACCGTTGGATCTAGTCAGCGGCACGGTGCAGTGTCTGAGGGGTGTGGACACGCACTTCCTCTCCGCTCAGGAATGTATCACGGCAGGACATTTCCAGAACGAGCATCCGAACGCGTGTAGGCACGCGTCCTCCGGCTACTTTGGCTCTAAATTCGTGACGGTGTGCGTGACAGGCGACGCCGACAACCACATCCACTTGGAGGGCTATCAGGTGTCGGGTCAGTGCGCGGCGCTGGTGAGGGACGGCATCCTACTGCCCACCAGGGACGCTCCCGAACTCGGATACATTCGGGATTGCTCGCCCGAACAGTACGTGCCTGACGTTTACTATAAGGAAAAGGATGCGTACGGCAACGAAGTAGGCGTGTCGGCGAAGAGGCTACCGGTGGCTTATTTGCTGGTGGACGTGCCGGTGGGCGTGGCGCCCGCAGCAGGCGAGCCCACCTTCGACCCCCGGGCGTCGTTTCCTCCCGCGCACCGGCCCCTGCAGCAGCACGTGCAGTCCCTGAGCGGCCTCCACGCGCACGTGGAGCGCGCCGAGTCGTTCCTGGCAGCGGCCTCCGACTTGCACGTGCTGCTGTTCCTGGCTACCAACGACGCCGCGCCGCTGAGCCTGGAGCAGCTGGCGCCGCTGCTGGACGCCGTCCGCCGCCGCGACGCGTCCGCGGCCGAGGCGTGGCGCGCGTCGCCCGCGGCCGCCGCGCTGCTGGCCCCCCGTCTCTTTCTGTGTCAGGTGACTCGTGTCAGTACAGCTCTTTCTCTCGTGTTTCAGGAACGCCATGTAACGAGCCCGCCGTCGGCCGCCGGTCCTTCCTCGCCGACGGAATATAGAAATTACTAG

Protein sequence:

>DPOGS201164-PA
MSGTKKMTLRVQSSEGTARVEMLDTEVTSRLYERVHDTLNLNSFGFALHKDRARKQEISSSKSRQLREYGLQHGDMLYLSPVNGTVLFDQPSTSSEPLNKPLTELSTEAGPSTVIPSNAVSKGPIEHEVDLQLYRLSGSIHRQRDEKLCRHNSKGCCVHCSALEPWDEGYLKEHNIKHMSFHAYLRKMTSGKFITLDELSCKIKPGCKEHPPWPRGICSSCQPGAVTLTRQPYRHVDNVLLEHAAPVERFLSYWRATGHQRVGFLYGRYELHPDVPLGIRARVAAVYEPPQECSRDAVRLASDDHAALLDRLAARLGLERVGWIFTDLLPLDLVSGTVQCLRGVDTHFLSAQECITAGHFQNEHPNACRHASSGYFGSKFVTVCVTGDADNHIHLEGYQVSGQCAALVRDGILLPTRDAPELGYIRDCSPEQYVPDVYYKEKDAYGNEVGVSAKRLPVAYLLVDVPVGVAPAAGEPTFDPRASFPPAHRPLQQHVQSLSGLHAHVERAESFLAAASDLHVLLFLATNDAAPLSLEQLAPLLDAVRRRDASAAEAWRASPAAAALLAPRLFLCQVTRVSTALSLVFQERHVTSPPSAAGPSSPTEYRNY-