Monarch geneset OGS2.0

DPOGS206502
Transcript         DPOGS206502-TA   3606 bp
Protein            DPOGS206502-PA   1201 aa
Genomic position   DPSCF300367 - 18347-31834
RNAseq coverage    289x (Rank: top 38%)
Annotation
Heliconius       HMEL006600               0.0      48.89%
Bombyx           BGIBMGA012725-TA         4e-101   48.03%
Drosophila       cnk-PA                   4e-69    34.73%
EBI UniRef50     UniRef50_UPI00022C977F   9e-85    40.56%   UPI00022C977F related cluster n=1 Tax=unknown RepID=UPI00022C977F
NCBI RefSeq      XP_001602905.1           8e-84    41.01%   PREDICTED: similar to conserved hypothetical protein [Nasonia vitripennis]
NCBI nr blastp   gi|350409681             3e-84    40.56%   PREDICTED: hypothetical protein LOC100748006 [Bombus impatiens]
NCBI nr blastx   gi|195123865             7e-105   28.07%   GI21034 [Drosophila mojavensis]
Group
Gene Ontology     GO:0005515   3.8e-17   protein binding
KEGG pathway
InterPro domain   [3-82]      IPR010993   3.8e-17   Sterile alpha motif homology
                  [4-78]      IPR013761   5.7e-12   Sterile alpha motif-type
                  [194-318]   IPR001478   9.2e-12   PDZ/DHR/GLGF
                  [8-72]      IPR021129   2e-11     Sterile alpha motif, type 1
                  [764-843]   IPR011993   2.3e-09   Pleckstrin homology-type
                  [6-74]      IPR001660   3.8e-09   Sterile alpha motif domain
                  [745-846]   IPR001849   4.4e-06   Pleckstrin homology domain
Orthology group   MCL14139    Single-copy universal gene
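
Several of the InterPro rows above cover overlapping residue ranges of the protein. As a minimal sketch (not part of the geneset pipeline), the overlapping hits can be collapsed into distinct regions by merging their coordinate intervals. The coordinates are copied from the InterPro rows; the function name `merge_domain_hits` is a hypothetical helper introduced for illustration only.

```python
from typing import List, Tuple

# (start, end) residue coordinates copied from the InterPro rows above.
INTERPRO_HITS: List[Tuple[int, int]] = [
    (3, 82), (4, 78), (194, 318), (8, 72), (764, 843), (6, 74), (745, 846),
]


def merge_domain_hits(hits: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping (start, end) ranges into non-overlapping regions."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # This hit overlaps the previous region, so extend that region.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap, so this hit starts a new region.
            merged.append((start, end))
    return merged


print(merge_domain_hits(INTERPRO_HITS))
# [(3, 82), (194, 318), (745, 846)]
```

Under this merging, the seven hits reduce to three regions: the sterile alpha motif hits at residues 3-82, the PDZ hit at 194-318, and the pleckstrin homology hits at 745-846.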
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206502-TA
ATGGGCAGTTTGAACATAGCGGAATGGACACCAGAGCAAGTAGCGGACTGGTTAACAGGTCTAGGACCGAAAGTAGCCCAGTATGTGCCGGAGCTACAGAAGAAGGCTCTCAATGGCTCGAAACTGCTGACGATGCGCTGTGATGACTTAGAATATCTAGGCGTTCATATCATTGGGCACCAGGAACTTATATTAGAGGCTGTGGAGCACTTACGAAACTTTCACTACGAGTCGTCCCGCGAGTGCGTGCAGCAGCTGGCGGTACGCGTGTCCGGGGCGGCTCAGTCCTTGGCCCGCGCACTGCGCTACCATGGTGACGCGCGCCTGGAGACGGACTCGCTGGCGGACGTGGCCCGCACTGTACACGCCGTCAAGCCCCTCGTGTGCTGGCTGGACCGCTGGCCGCTGTGTTCGGGTTCGCCTCTCGCGGCACGGAAGGCTGCCCTGTTGAAGCTGTCCCTGGAGGCGGCCACGTGCGCCCAGCGGGAGAGGTTCGCGGAGCAGCCGGCCCGCGCCGTGGCCGCCGCCGCTGCCGCCCTCGCCGCACTCGCAGACTACATCATACAGGACGTGTCGGACCCCATGATCCTGCAGCCGGCGCGGGTGGACTCGGTGTCGCTGGTGCAGGGAGAGCGCGCGCTCGGCTTCGAGGTGGTCCCGTCGTTCTGCGGGCACCACCAGCTCGCACACATCAGGTTCGCGTCTCCCGCACACGCCTCCGGGCTCGTGCACGAGGCCGACGAGATCGTCCAGGTGGGAGGGCGCTGCGTGGTGGGTTGGCCCGGGGAGGCCGTGGAGGCGGCGTGCACGCGGGCGGCGCGGGGCGGAGACCTGGCGCTGAAGCTGCGGCGGCGCGGGGCGCGTGCTCTGCCCGCCCTGCCCACTCCGCCGCTCCGCGCGCCCCGCGCTCGGGCCCGTCCTCGTCCTCCGCACGCTCCCTTCACTCTGCACCGGTACGAGCTGGAGTTTCCCCTGTCGGGCGCCGTGCACCCCGCGCCCCGCTCCCCTCCCCGCCCGCCCCGCGCCGACAGCCCGTCCTCCGAAGACAGCGACGCTCTCTCCCCGCCCGCCTCGCCGACGCTCCTACTGCTGCCGGACACCGCGCGTATGTATCCTCCCAAGCCGCGCCTGTCGGTGGTCCGTCGTCACTCAGTGAGCGGGGAGACGCCGGCGGCCGCGAGTCACGCGCTCGCCGTCCACCAGTTATGGCAGCAGTTGCAGCAGCAGCGGCTGGCGTGTGTCGACGGTGACAACGCCCTGTATCGGAGAGATAAGGCCGTGTCGTGTAGTACGGGCCTGCAGTTGTCTCCTCGGCCCCGTACGTGCCTGGTGGTGCCGCGGCCGCTGGGTCAGCTGGGCCAGGTCGCGGGGCCTCCGGCCGCCTCGCCGTGTCGCGGGAAGCTCGACAAGAGTCACTCGACCCCCGCCTACGACTTCGAACCGAGCTCGGAGCCGGGGTCGCTGGCCGCGCAGACCATCCCGGAGTCCCCGACCACTCCAGTCACGGACGCCCCGCCTCACACAGAGAAGGCGGGGCAGATACTGGACTTCAAGAAGTCCAGTTCGCAGATAGAGGAGGCCATCCAGCAGAGGAACCGGCGGGCGAACGGAGACGACGACAAGAACGACGCGTTCACAGAAGACGACACGAAGCTAGAGATAGTGGAGACGGTCAACGAGGTGATGAGGGAGACGGCGGGAGGGGCGAGGGAGGGAGGCAGGGAGGACGCGAGGAGGAGGAGGGAGGCGGCCGGGGACGACACGAGCAGCGACGACCAACAAGCGGCTGGAACCATGAGACGAACGAAACCTCGCATCACGGAACGCTTCCGTCCGGCGGACCCGTCCCGGCCTCCTTCCCCCCCACCTCGCCCCTCCCTCCCTCATCGTCCCGCGCCTCCCGCTCCCCGCGAGCCTCGCCCTCCTCCCCGAGACGCTCCTCAGTACCCTCCAGTGCGCGTCCTCACCCGACCCGCCGAGTCTCGGGACGCCTCCCCGCT
CAAGGCCCTGAGGCCAGACATCCCGCAGGGGTCTGTGCCGTTGAGACACATCACCAAACACGACATCAAACTGGTGTCGGCGGAGAGGCGCGAGATGCCCGCCATCAACGGCGAGGCGGAGCGCGCGGCCGCACGGAAGGATGACGTCACGGTGACGGCGGCGAGTCCGGGCGGGGGGGCGGCGGCGGGCGGGGGCGGGCGTCGGTCGGTGCCGGCGCGGTTGCTGTCGGGGCGCGGCGCCTGCGGCAGCGTTGTGCAGCGCGTGCGGGCGGGCGCGGGCACGGGCTGCACGCGCTGGGCCTCCCGCCACCTGCTGCTGGCCCACAACCTGCTGTACGCCTACCGGTCGGCGGAGTGCTCGCGCGCCGCCTGCATGATCTACCTGGAGGGGTTCACGGTGTGCGCGGCGGCCGAGGTCAAGTCCCGCGCGCACGCCTTCAAAGTGTACCACACGGGGACGGCCTTCTACTTCGCGTGCGACTCGCGCGAGGCGATGCTGGCCTGGATCGGTCTCATCCACCGCGCCACTCTGCTGCCGTCGCTGCTCTCCGAGGCGATGGAGTTATCGAAACAGTTTTCGGAAACTGATTACTCCGAAACCGAATCGGACTTAGAAACATCGGAAAGAAGGTTAGAAAAGGAAAAAGAGAGGGAGAAGGAGAGGGAGAGGGAGAAGGAGAAGTCGAAGTTTGGATCGTTGAAGAAGTTAACGCATCGGACGAGTAGGAGCGAGTCACAGGAGAACGTGAGCCAGCAGGCGGCCACCAGCCTCGACAGGAAGTACCTCAGGTTCTTCTCACGGGCGCGCGCCAAGGACGACAACAAGACGCCTAAGAAACCGTCCGGCGTGCCCGTGCCTACCGAGCATTACCGCAGCTACCGCCGCGCGGAGCCCCCTCTACCCTCTCCCCGCGCCCCGCCGCCCGCGCCCCGCCCCTCATCCTCCAGTAGCAAGAAGCTGCCGAAGCCCATCAACTACATCCACGCGTCCAACCCCAACTTGCTGGACTTCGAGAAGAGCGACTTCGTCACCAAGCCGACCATCCAGGTTCCGAAGCCGAAGGTGTCGAAGCCGGACAGCCTGGCGGGGTTCGTGACGCTCGAGGAGTTCATGCTGCAGAAACAGGCAGAGGAGAGGCAGCAGCTGTATTCCGGCCGCGTGCTGCTCGGAGTGGAGCGGGGGGCGCGGGCCGCGGGCAGGGGGGCGCAGGGGGAGGGAGGCGAGCTGCAACGGCGGCTCGACCGGATAGTGCCCGACGTCATCTACGGAGAGTTGGCGCCCGAACATAGAGACAGAAACAAACCGGTCTCCGTCCCTGACAAGGATGGTTACGAGACCCTCGTGTATCCGGACGAGAGAGACGGCCGGACTGACTCGATAGTGTCCGGCTCCTCCCACGGTCCGCACTCGGCCGCCGACAGCGTGTCGACGGGCGTGTCGCGGCTACGGCTCATGTTTGGAGCAAGACGGGACCTCGTGCGACAAGAGAGCAGTCGCTCGGAGCAGTATCCTCACCTGCAGTGTCCGCCGACCTTCCAGCCGGAGACGTACTCCCTGGCGCGGCCTCCGAGAGACGCACACACACGCACACACGCGCGCGACTGA

Protein sequence:

>DPOGS206502-PA
MGSLNIAEWTPEQVADWLTGLGPKVAQYVPELQKKALNGSKLLTMRCDDLEYLGVHIIGHQELILEAVEHLRNFHYESSRECVQQLAVRVSGAAQSLARALRYHGDARLETDSLADVARTVHAVKPLVCWLDRWPLCSGSPLAARKAALLKLSLEAATCAQRERFAEQPARAVAAAAAALAALADYIIQDVSDPMILQPARVDSVSLVQGERALGFEVVPSFCGHHQLAHIRFASPAHASGLVHEADEIVQVGGRCVVGWPGEAVEAACTRAARGGDLALKLRRRGARALPALPTPPLRAPRARARPRPPHAPFTLHRYELEFPLSGAVHPAPRSPPRPPRADSPSSEDSDALSPPASPTLLLLPDTARMYPPKPRLSVVRRHSVSGETPAAASHALAVHQLWQQLQQQRLACVDGDNALYRRDKAVSCSTGLQLSPRPRTCLVVPRPLGQLGQVAGPPAASPCRGKLDKSHSTPAYDFEPSSEPGSLAAQTIPESPTTPVTDAPPHTEKAGQILDFKKSSSQIEEAIQQRNRRANGDDDKNDAFTEDDTKLEIVETVNEVMRETAGGAREGGREDARRRREAAGDDTSSDDQQAAGTMRRTKPRITERFRPADPSRPPSPPPRPSLPHRPAPPAPREPRPPPRDAPQYPPVRVLTRPAESRDASPLKALRPDIPQGSVPLRHITKHDIKLVSAERREMPAINGEAERAAARKDDVTVTAASPGGGAAAGGGGRRSVPARLLSGRGACGSVVQRVRAGAGTGCTRWASRHLLLAHNLLYAYRSAECSRAACMIYLEGFTVCAAAEVKSRAHAFKVYHTGTAFYFACDSREAMLAWIGLIHRATLLPSLLSEAMELSKQFSETDYSETESDLETSERRLEKEKEREKEREREKEKSKFGSLKKLTHRTSRSESQENVSQQAATSLDRKYLRFFSRARAKDDNKTPKKPSGVPVPTEHYRSYRRAEPPLPSPRAPPPAPRPSSSSSKKLPKPINYIHASNPNLLDFEKSDFVTKPTIQVPKPKVSKPDSLAGFVTLEEFMLQKQAEERQQLYSGRVLLGVERGARAAGRGAQGEGGELQRRLDRIVPDVIYGELAPEHRDRNKPVSVPDKDGYETLVYPDERDGRTDSIVSGSSHGPHSAADSVSTGVSRLRLMFGARRDLVRQESSRSEQYPHLQCPPTFQPETYSLARPPRDAHTRTHARD-
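
The transcript and protein lengths reported in this record can be cross-checked with simple arithmetic. The sketch below assumes the transcript is a complete coding sequence whose final codon is a stop codon (the nucleotide sequence above ends in TGA, and the protein sequence ends in `-`). The function name `expected_protein_length` is a hypothetical helper, not part of any geneset tooling.

```python
def expected_protein_length(cds_bp: int, has_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    # The stop codon is translated to no residue, so it is not counted.
    return codons - 1 if has_stop_codon else codons


# Lengths taken from the Transcript and Protein fields of this record.
TRANSCRIPT_BP = 3606
PROTEIN_AA = 1201

print(expected_protein_length(TRANSCRIPT_BP))                # 1201
print(expected_protein_length(TRANSCRIPT_BP) == PROTEIN_AA)  # True
```

Here 3606 bp gives 1202 codons, and removing the stop codon leaves 1201 residues, which agrees with the reported protein length.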