Monarch geneset OGS2.0

DPOGS210492
Transcript: DPOGS210492-TA; 1158 bp
Protein: DPOGS210492-PA; 385 aa
Genomic position: DPSCF300186; 237038-241297
RNAseq coverage: 166x (Rank: top 51%)
Annotation
Heliconius: HMEL016338; 1e-47; 47.62%
Bombyx: BGIBMGA012583-TA; 6e-114; 69.93%
Drosophila: Cpsf73-PA; 2e-119; 58.13%
EBI UniRef50: UniRef50_Q9VE51; 2e-117; 58.13%; Cleavage and polyadenylation specificity factor 73 n=31 Tax=Eumetazoa RepID=Q9VE51_DROME
NCBI RefSeq: XP_001605081.1; 4e-140; 65.38%; PREDICTED: similar to cleavage and polyadenylation specificity factor [Nasonia vitripennis]
NCBI nr blastp: gi|307177772; 8e-140; 65.29%; Cleavage and polyadenylation specificity factor subunit 3 [Camponotus floridanus]
NCBI nr blastx: gi|307177772; 6e-133; 65.29%; Cleavage and polyadenylation specificity factor subunit 3 [Camponotus floridanus]
Group
KEGG pathway 
InterPro domain:
[184-380] IPR021718; 2.2e-38; Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term
[27-75] IPR022712; 6.7e-15; Beta-Casp domain
[89-129] IPR011108; 4.9e-13; RNA-metabolising metallo-beta-lactamase
Orthology group: MCL30943, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210492-TA
ATGAGGTGTCGCTGTGATTCCAAATACAGTACTCTTCATTTGAGATCCCACCAGGCACCGGGCATCGATCACTTCGAGGACATAGGTCCGTGTGTGATCATGGCTTCCCCGGGTATGATGCAGTCGGGCCTCTCCCGGGAACTGTTCGAGTCGTGGTGCACGGATCCCAAGAACGGCGTCATCATAGCAGGTTACTGCGTGGAAGGCACCCTGGCCAAAACTATACTGTCGGAGCCGGAAGAGATCACGACTATGTCAGGACAGAAACTTCCGCTGAAGATGTCCGTGGATTACATATCGTTCTCCGCGCACACGGACTACCAACAGACCTCAGAGTTTATCAACATTCTGAAGCCTCCTCATGTGGTGTTAGTTCACGGGGAACAGAACGAGATGTCTCGTCTGAAGGCGGCCCTGCAGCGCGAACACCGCGGCCGCCTCGCCATACACACGCCCAGGAACACGCAACAGCTGGCCCTCACCTTCAGAGGCGACAAGACCGCTAAGGTAATGGGGTCCCTGGCCATGGAGGCGCCGGTGCCGGGCGCACAGCTCCAGGGTGTTCTGGTCAAGAGGAACTTTAACTATCACATCCTGGCGCCCTCCGACTTGAACAAGTACACGGACCTGTCCCAGTCGTCGGTGTCTCAGCGCGTGTCAGTGTGGTGCGGAGCTCCGGTGGGTCTGGTCCGACACGCCGTGATGCGCCTGGCGGGGCCCGTGGTGTTCCTGAGCGACACTCGCTGGAGGCTCTACGGCTGCATCGACCTCACGCTGGACCTGCCGCTCGTCACGCTGGAGTGGCAGGCGGCGCCGGTGTCTGACATGTTCGCGGACGCGGTGGTGGCGGCGCTGCTGGCGGCCCCGGCCTCCGCCCCCGGGCCCGCGCCCAACGCGCCCCTCGCACACAAACTGGACAAGATGCATTTCAAGGAGTGTGTGATCGAGATGTTGTCGGAGATGTTCGGCGAGGCGGCCGTGGCCAAGATGTTCCGCGGAGAGCGACTCACGGTCACGCTCAACGAGCGCCAGGCGCACCTAGACCTCGCCACCATGGAGGTGAAGTGTCCCGAGGACGAGTCTCTGGAGCGCACAATCCAGTCCGCCATCAGCAAGCTGCACGCCGCCCTCTCGCCCGTCCGGCCTCCCGCACCCTGA

Protein sequence:

>DPOGS210492-PA
MRCRCDSKYSTLHLRSHQAPGIDHFEDIGPCVIMASPGMMQSGLSRELFESWCTDPKNGVIIAGYCVEGTLAKTILSEPEEITTMSGQKLPLKMSVDYISFSAHTDYQQTSEFINILKPPHVVLVHGEQNEMSRLKAALQREHRGRLAIHTPRNTQQLALTFRGDKTAKVMGSLAMEAPVPGAQLQGVLVKRNFNYHILAPSDLNKYTDLSQSSVSQRVSVWCGAPVGLVRHAVMRLAGPVVFLSDTRWRLYGCIDLTLDLPLVTLEWQAAPVSDMFADAVVAALLAAPASAPGPAPNAPLAHKLDKMHFKECVIEMLSEMFGEAAVAKMFRGERLTVTLNERQAHLDLATMEVKCPEDESLERTIQSAISKLHAALSPVRPPAP-