Monarch geneset OGS2.0

DPOGS207637
Transcript | DPOGS207637-TA | 876 bp
Protein | DPOGS207637-PA | 291 aa
Genomic position | DPSCF300199 + 103962-111252
RNAseq coverage | 622x (Rank: top 21%)
Annotation
Heliconius | HMEL011988 | 1e-54 | 98.02%
Bombyx | BGIBMGA006140-TA | 2e-90 | 96.34%
Drosophila | CG5830-PA | 1e-105 | 68.66%
EBI UniRef50 | UniRef50_F7AFK8 | 1e-89 | 59.59% | Uncharacterized protein n=14 Tax=Coelomata RepID=F7AFK8_CIOIN
NCBI RefSeq | XP_623986.1 | 1e-112 | 73.13% | PREDICTED: similar to CG5830-PA [Apis mellifera]
NCBI nr blastp | gi|345486020 | 1e-112 | 73.63% | PREDICTED: carboxy-terminal domain RNA polymerase II polypeptide A small phosphatase 1-like [Nasonia vitripennis]
NCBI nr blastx | gi|345486020 | 8e-114 | 73.45% | PREDICTED: carboxy-terminal domain RNA polymerase II polypeptide A small phosphatase 1-like [Nasonia vitripennis]
Group
Gene Ontology | GO:0005515 | 1.3e-76 | protein binding
Gene Ontology | GO:0016791 | 3.1e-66 | phosphatase activity
KEGG pathway
InterPro domain | [111-254] | IPR004274 | 1.3e-76 | NLI interacting factor
InterPro domain | [101-277] | IPR023214 | 1.6e-71 | HAD-like domain
InterPro domain | [112-275] | IPR011948 | 3.1e-66 | Dullard phosphatase domain, eukaryotic
Orthology group | MCL11601 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207637-TA
ATGGACGCATCGTCCATTATAACTCAAGTGTCACGGGATGACGAACAATTAAACAACTATGGATCTGAAAGAGGGTCTTCGCCTGGCGGACAGCAAAATAACGAGGATGGAACGCCAGTTTCTGGCTCCACTCCATTGGGGAGCAAGAAGTCAAGTGGCGGTGGAGGTTTCCTACGATCTTTACTGTGTTGTTGGCGTGGGGGCCGCGGAAAAGGGCCCCCCGGTGGAAATGGCACCAACAGTATCGATGGCAGAGCGTCACCCCCCCTTCTTGTTATGTCAGATCATGCACCACGACCTTTATTACCACCAGTGAGACATCAGGATATGCACAAAAAATGTATGGTCATAGATCTCGACGAAACCTTAGTACACAGCTCTTTCAAGCCGATCAACAACGCTGATTTCGTGGTACCAGTGGAGATAGATGGTGCCGTCCACCAGGTGTATGTCCTGAAGAGGCCGCACGTGGACGAATTCCTAAGAAGATGCGGTGAACTATACGAGTGCGTGCTCTTCACCGCCTCCCTGGCGAAGTACGCGGATCCGGTCGCTGACTTATTAGACAGATGGGGCGTGTTCCGCGCTCGTCTGTTCCGGGACAGCTGTGTGTTCCACCGCGGGAACTACGTCAAGGATCTGAACAGCCTGGGGCGGGATCTGCGACGGGTCGTCATAGTCGACAACTCACCAGCATCATACATCTTCCACCCTGATAACGCTGTGCCGGTCGCTTCGTGGTTCGACGACATGACGGACTCCGAGCTCCTGGACCTGATACCGTTTTTCGAGAAACTGAGCAAGGTGGACAGTGTGTACACAGTGTTGCGCAACTCTAACCACCCCTACAACCCGGCACAGAGCTCGCCGACATAG

Protein sequence:

>DPOGS207637-PA
MDASSIITQVSRDDEQLNNYGSERGSSPGGQQNNEDGTPVSGSTPLGSKKSSGGGGFLRSLLCCWRGGRGKGPPGGNGTNSIDGRASPPLLVMSDHAPRPLLPPVRHQDMHKKCMVIDLDETLVHSSFKPINNADFVVPVEIDGAVHQVYVLKRPHVDEFLRRCGELYECVLFTASLAKYADPVADLLDRWGVFRARLFRDSCVFHRGNYVKDLNSLGRDLRRVVIVDNSPASYIFHPDNAVPVASWFDDMTDSELLDLIPFFEKLSKVDSVYTVLRNSNHPYNPAQSSPT-