Monarch geneset OGS2.0

DPOGS215383
Transcript: DPOGS215383-TA (2067 bp)
Protein: DPOGS215383-PA (688 aa)
Genomic position: DPSCF300088 - 468525-472129
RNAseq coverage: 328x (Rank: top 35%)
Annotation
Heliconius:      HMEL017421        1e-131  47.11%
Bombyx:          BGIBMGA012396-TA  1e-103  44.42%
Drosophila:      CG1105-PA         1e-60   38.73%
EBI UniRef50:    UniRef50_E3WVS6   8e-62   39.62%  Putative uncharacterized protein n=2 Tax=Endopterygota RepID=E3WVS6_ANODA
NCBI RefSeq:     XP_001849868.1    3e-65   40.45%  arrestin domain containing 4 [Culex quinquefasciatus]
NCBI nr blastp:  gi|170044468      6e-64   40.45%  arrestin domain containing 4 [Culex quinquefasciatus]
NCBI nr blastx:  gi|170044468      8e-66   36.28%  arrestin domain containing 4 [Culex quinquefasciatus]
Group
KEGG pathway:    cpv:cgd3_2620     3e-07
    K03006 (RPB1) maps to:
        Huntington's disease
        Purine metabolism
        Pyrimidine metabolism
        RNA polymerase
InterPro domain:
    [8-153]    IPR011021  1.3e-37  Arrestin-like, N-terminal
    [13-149]   IPR014756  2.8e-26  Immunoglobulin E-set
    [178-306]  IPR011022  1.4e-16  Arrestin-like, C-terminal
Orthology group: MCL25964 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215383-TA
ATGGGTTTTGAAGATGGTCAAATAGTTCTAGACAGTCCCAATGGAGCGTACTACTCAGGACAGGCGGTCTGTGGAACATTACATTTTATACAGACAAAAGAAAAAACATTCAGAGGTATATATGTACAGTTTAAAGGATATTGTAAAGTTCATTGGACCACTTCTCACACTAGGACGGTAAATGGTAAAAGTGAATCATATACAACATCTCATGATTCACATGAAGAATATGTGAATGTGAAGACTTATCTAGTTGGAGGAGAGTCAGGCGAACATAGCATCGACCCTGGAACTTATGAGTACAATTTCAGGTTCAATATTCCTGTCAATTGCCCATCATCCTTTGAGGGTCACTTGGGTCATGTGAGATATGAGATTAAAGCCGTAGTAGACAGGGCATTTAAATTTGATCAGGAGAAGAAAGTTGCTGTACGTGTTATGGCACCCCTGGATCTCAATCAAAATCCTTATTGCAAGGATCCTTTGGAGCTGGAGTTCAATGATTCCTACTGCTGCTTCTGTATGGGATCAGGCTCAGCGGACACAATGGTGAAGCTGCCTGTAGCAGGTTATTGCCCTGGCCAGACTATACCCATCGAACTTAAATGTTCTAATCAAGGCAGTGTTGAGATTGATTATATAAAATTGGAAATAACTAAGAAACTAACCTTTACTGCAACCCACGAGCCTGGAACTAGGACAGAGAAAGAAACTGTAGCAGAGATCAAGAAGAATTCTATACCCACAAACACCACCAGAGATTGGACCGTGGAGATGATGGTCCCAGCCCTGGATGTATACAATCTGGACAACTGTCGGTACATTGATGTGGAATACAAATTCAAGGTGACAGTTAACCCTTCTGGATGTCACTCATCCACTGATGGAAGTCGGAAGATTATAATTGGTACAGTTCCACTAGTCGGTTTCCAAGATGACGTACAGAATCCACTCGAAAGTCAAATGCCGCAGCAAACGATTACAGCCGTTACCCAGCAGCCAGTATCAGGAATAAGTTCCTACCCTGGATCCCCATACCCTCCTGTGGTTAATGTTCAACCTTACCCTAATACTGCCTCACCGTACCCGCCAGCTCCGTCACCCTATCCTCAAACCACATCACCCTATCCACAAGCCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACATCACCCTATCCTCCAACCACGTTTCCTTACCCTGAAACCACGTCACCATATCCAAACCCTGCATCACCCTATGCTCCCACTTCTTCACCTTACCCAAATAAATCTCCATACCCTACTGGCAATTCCCCATACCCTCCTGCGAGCCCATCAAGTTCTCCATATCCAGACAACCCTCCCCCATACCCTGGTAACAATCAAGCAAATAATTCCCCATATCCAGCCAGCCCTCACCCCTCTACTAATTCCCCCTATCCGGCTGCTCCCTACCCTGCAACTAATGCACCCTACCCTGATTCTTCCCCTTACCCTCCTAAACAAGACAAGACAAACAAGCCCCTGGGTTTCTCGGTTCCAAGTGGTAATGAAGTCAGTACACCACTTTTGCAGCCGAACCTCGATCCCAGCCCTTACCCAACCATGTCGCCTGGTATACACCACCATCAGCTTTCTGCCGCCGTTGATAACTTCGTTTCTGTCGTTGGCATGTATTCACTGCTCACTGTTTTCCACACACAGCCTACATCGAGCCCCAACCCGTTCGCAGCTGCCAGCGCGCCCGCACTGGA
CTCGCCCGATACCCGTAGAAATGAAACGAATGTGAATTTAGAAAGCACCTTATCATACCTATTTTGA

Protein sequence:

>DPOGS215383-PA
MGFEDGQIVLDSPNGAYYSGQAVCGTLHFIQTKEKTFRGIYVQFKGYCKVHWTTSHTRTVNGKSESYTTSHDSHEEYVNVKTYLVGGESGEHSIDPGTYEYNFRFNIPVNCPSSFEGHLGHVRYEIKAVVDRAFKFDQEKKVAVRVMAPLDLNQNPYCKDPLELEFNDSYCCFCMGSGSADTMVKLPVAGYCPGQTIPIELKCSNQGSVEIDYIKLEITKKLTFTATHEPGTRTEKETVAEIKKNSIPTNTTRDWTVEMMVPALDVYNLDNCRYIDVEYKFKVTVNPSGCHSSTDGSRKIIIGTVPLVGFQDDVQNPLESQMPQQTITAVTQQPVSGISSYPGSPYPPVVNVQPYPNTASPYPPAPSPYPQTTSPYPQATSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTSPYPPTTFPYPETTSPYPNPASPYAPTSSPYPNKSPYPTGNSPYPPASPSSSPYPDNPPPYPGNNQANNSPYPASPHPSTNSPYPAAPYPATNAPYPDSSPYPPKQDKTNKPLGFSVPSGNEVSTPLLQPNLDPSPYPTMSPGIHHHQLSAAVDNFVSVVGMYSLLTVFHTQPTSSPNPFAAASAPALDSPDTRRNETNVNLESTLSYLF-
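
The transcript and protein above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code and that the trailing `-` in the protein marks the stop codon. The `translate` helper and its `stop_symbol` parameter are hypothetical names introduced for illustration. It translates only the first and last few codons of DPOGS215383-TA, and checks that 688 residues plus one stop codon account for the stated 2067 bp.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; unknown codons become 'X'.

    Stop codons are written as stop_symbol to match the '-' that ends
    the protein sequence in this record (an assumption about its meaning).
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First six codons and last five codons of DPOGS215383-TA, copied from the record.
head = translate("ATGGGTTTTGAAGATGGT")
tail = translate("TCATACCTATTTTGA")

# 688 amino acids plus one stop codon, three bases each.
expected_cds_length = 3 * (688 + 1)

print(head, tail, expected_cds_length)
```

Passing the full nucleotide sequence (with the line break removed) to `translate` should give the whole protein for comparison with DPOGS215383-PA.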