Monarch geneset OGS2.0

DPOGS210401
Transcript: DPOGS210401-TA, 3420 bp
Protein: DPOGS210401-PA, 1139 aa
Genomic position: DPSCF300291 + 156796-175947
RNAseq coverage: 209x (Rank: top 46%)
Annotation
Heliconius: HMEL021505, 0.0, 81.12%
Bombyx: BGIBMGA008411-TA, 0.0, 84.63%
Drosophila: sdt-PG, 0.0, 74.89%
EBI UniRef50: UniRef50_D2A2A9, 0.0, 61.93%, Putative uncharacterized protein GLEAN_07820 n=2 Tax=Tribolium castaneum RepID=D2A2A9_TRICA
NCBI RefSeq: XP_001664235.1, 0.0, 60.06%, membrane-associated guanylate kinase (maguk) [Aedes aegypti]
NCBI nr blastp: gi|157138519, 0.0, 60.06%, membrane-associated guanylate kinase (maguk) [Aedes aegypti]
NCBI nr blastx: gi|347968407, 0.0, 60.22%, AGAP002711-PF [Anopheles gambiae str. PEST]
Group
Gene Ontology: GO:0005515, 6.1e-54, protein binding
KEGG pathway: aag:AaeL_AAEL014012, 0.0
  K00942 (E2.7.4.8, gmk) maps-> Purine metabolism
InterPro domain:
  [941-1125] IPR008145, 6.1e-54, Guanylate kinase/L-type calcium channel
  [942-1120] IPR008144, 6.5e-44, Guanylate kinase
  [292-446] IPR001452, 3.2e-26, Src homology-3 domain
  [682-805] IPR001478, 9.1e-23, PDZ/DHR/GLGF
  [321-382] IPR011511, 3.5e-11, Variant SH3
Orthology group: MCL13805, Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210401-TA
ATGACGAAGATAGTCCAGAGTTTGAACGCATATCTGCTGGATACTTCGGCATGCTGTATGAACGTGGATAGACTGAGACAGTTCTGCTTATGTCGAGGATACCAGAAGGATAGTAACCATGAGATCATGGTGTCCTCGTCGCCACGCGCACAGACGCGCTCCCAACATGGCGTCGACCCAGAAAGAATAAAGGCATATTCGGAGCAACTCCGTCAAAGGAAAGAAGCTGAAGAGCGAATTGCTGCTCAGAACGAGTTCCTGAGGACCTCGCTGCGAGGCTCCCACAAGCTACAGGCCTTGGAGTCCAACCCGCCCTCCTCAACAGCCTTCGTTAATGATGCCTACGAGGAAGACGCCGCTGATGAGGCAGAGCAACTGTACGCGCTCGTCGACTATCAGGAAGTATGTGCAGCGTTATCGAGGGTTCAAAAATCATTATCGTCTTGCGGGGAGGGTGCTCTAGCAGCGAGAGTCTCAGGTGCCGCGGGAGCCCTGCTATGTCCCGCTCTGAGGACAGCTCTAGCTACACGATCCGCAGTCCTAACAGCCGTTAGACATAAACGACCAGGATTGTCTCCACCGCAGACACATCGAGCCACTGATAGACTAAAGGATACTGGCTTCAACAACTACAAAGACGACAAATATTTACCGAATTCGCCGGACGACTCCAGCGAGAATATTAAGATAATTAAAATAGAAAAAACGAATGAACCACTTGGTGCTACAGTGAGGAATGAGGGAGAGGCGGTCATTATTGGAAGGATAGTGAGAGGAGGAGCGGCAGAGAAATCAGGACTCCTACACGAGGGGGATGAACTCCTCGAAGTCAACGGTGTGAGTATGCGCGGTAAGTCGGTGCATGAGGTGTGTCAGGTCTTGGGAGGCCTGGCGGGCACCCTGAGCGTGGTGCTCGCCCCCCGGCCGAGACCCAGACCTCCTCCTGCCTACCGCGTCTTACACGTCAGGGCCCACTTCGACTACGACCCTGAGGACGACGTTTATATACCATGTCGGGAACTCGGTATCAGTTTTCAAAAAGGCGATGTGTTGCACGTCATCAGTCGCGAGGATCCCAACTGGTGGCAGGCCTTCAGGGAGGGAGAGGAGGATCAGACGCTCGCCGGCTTGATCCCCAGCCAGGCCTTCCAGCATCAACGCGAATCAATGAAGCTGTCTTTGGCGGGCGAGGCGGGCTCCGCTGCCAGAAGGTCGAGGAAGGGCGCCACGTTGTTGTGTGCTCGAGCCAGGAGACGCAAGCCCAGGAAACCCACCAGCGAGGCTGGCTACCCGCTGTATTCTGCACAGCCTGACGAGTTCGAGGCGGAGGAGATACTCACGTACGAGGAGGTGGCTCTGTACTACCCGCGCGCCTCCCACAAGCGACCCATCGTCCTCATCGGGCCCCCCAACATCGGTCGCCACGAGCTCAGGCAGAGACTCATGGAAGACTCCACGAGGTTCGCTGCAGCCGAGCAACTCCGTCAAAGGAAAGAAGCTGAAGAGCGAATTGCTGCTCAGAACGAGTTCCTGAGGACCTCGCTGCGAGGCTCCCACAAGCTACAGGCCTTGGAGTCCAACCCGCCCTCCTCAACAGCCTTCGTTAATGATGCCTACGAGGAAGATGCCGCTGATGAGGCAGAACAACTGTACGCGCTCGTCGACTATCAGGAAGTATGTGCAGCGTTATCGAGGGTTCAAAAATCATTATCGTCTTGCGGGGAGGGTGCTCTAGCAGCGAGAGTGTCAGGCGCCGCGGGAGCCCTGCTATGTCCCGCTCTGAGGACAGCTCTAGCGACACGATCCGCAGTCCTAACAGCCGTTAGACATAAACGACCAGGATTATCTCCACCGCAGACACATCGAGCCACTGATAGACTAAAGGATTGTATAGATGTGTTAGGATCTCACACGTCGTCTGGAAGCGAGACATCGGCATTGGCTGCGGAACTGCTGTCAATCCTCGGAGGTCTTGAGGTGGAGAGTGTGATACAGGCCCA
CGACCAGGCCGCTGCACTACTGGACCCCTCGTGTTTCAATAGAGTGAAGAGAAATAAGACTGGCTTCAACAACTACAAAGACGACAAATATTTACCGAATTCGCCGGACGACTCCAGCGAGAATATTAAGATAATTAAAATAGAAAAAACGAATGAACCACTTGGTGCTACAGTGAGGAATGAGGGAGAGGCGGTCATTATTGGAAGGATAGTGAGAGGAGGAGCGGCAGAGAAATCAGGACTCCTACACGAGGGAGATGAACTCCTCGAAGTCAACGGTGTGAGTATGCGCGGTAAGTCGGTGCATGAGGTGTGTCAGGTCTTGGGAGGCCTGGCGGGCACCCTGAGCGTGGTGCTCGCCCCCCGGCCGAGACCCAGACCTCCTCCTGCCTACCGCGTCTTACACGTCAGGGCCCACTTCGACTACGACCCTGAGGACGACGTTTATATACCATGTCGGGAACTCGGTATCAGTTTTCAAAAAGGCGATGTGTTGCACGTCATCAGTCGCGAGGATCCCAACTGGTGGCAGGCCTTCAGGGAGGGAGAGGAGGATCAGACGCTCGCCGGCTTGATCCCCAGCCAGGCCTTCCAGCATCAACGCGAATCAATGAAGCTGTCTTTGGCGGGCGAGGCGGGCTCCGCTGCCAGAAGGTCGAGGAAGGGCGCCACGTTGTTGTGTGCTCGAGCCAGGAGACGCAAGCCCAGGAAACCCACCAGCGAGGCTGGCTACCCGCTGTATTCTGCACAGCCTGACGAGTTCGAGGCGGAGGAGATACTCACGTACGAGGAGGTGGCTCTGTACTACCCGCGCGCCTCCCACAAGCGACCCATCGTCCTCATCGGGCCCCCCAACATCGGTCGCCACGAGCTCAGGCAGAGACTCATGGAAGACTCCACGAGGTTCGCTGCAGCCGTTCCGCACACATCCCGCGCCCGCAAGGACCACGAGGCGGCCGGCCAGGACTATCACTTCATATCCCGCGCTCAGTTCGAGGCGGACATCCTGAACAGGAAGTTTGTGGAGCACGGAGAATACGAGAAGGCTTATTATGGTACATCCGTCGAGGCGATCCGCGAAGTGGTGAACTCCGGTAAGATCTGTGTCCTGAACCTTCACCCTCAGTCGCTGCGAATCCTGCGAGGCTCCGACCTCAAGCCCTACACCGTGTTCGTGGCGCCGCCCAGCCTGGAGAAGCTGCGGCAGAAGAAGATCAGGAATGGAGAGGCCTTTAAGGAGGAGGAACTAAAAGAGATAATAGCGACCGCGAGGGACATGGAACTCCGCTGGGGTCACTTGTTCGACATGATCATTATTAACAACGACACGCAGCGCGCTTACCAGCAACTGTTGAACGAGATCAACAGTCTGGAGAGGGAACCGCAATGGGTCCCAGCGCACTGGCTCAAACAGACCTAG

Protein sequence:

>DPOGS210401-PA
MTKIVQSLNAYLLDTSACCMNVDRLRQFCLCRGYQKDSNHEIMVSSSPRAQTRSQHGVDPERIKAYSEQLRQRKEAEERIAAQNEFLRTSLRGSHKLQALESNPPSSTAFVNDAYEEDAADEAEQLYALVDYQEVCAALSRVQKSLSSCGEGALAARVSGAAGALLCPALRTALATRSAVLTAVRHKRPGLSPPQTHRATDRLKDTGFNNYKDDKYLPNSPDDSSENIKIIKIEKTNEPLGATVRNEGEAVIIGRIVRGGAAEKSGLLHEGDELLEVNGVSMRGKSVHEVCQVLGGLAGTLSVVLAPRPRPRPPPAYRVLHVRAHFDYDPEDDVYIPCRELGISFQKGDVLHVISREDPNWWQAFREGEEDQTLAGLIPSQAFQHQRESMKLSLAGEAGSAARRSRKGATLLCARARRRKPRKPTSEAGYPLYSAQPDEFEAEEILTYEEVALYYPRASHKRPIVLIGPPNIGRHELRQRLMEDSTRFAAAEQLRQRKEAEERIAAQNEFLRTSLRGSHKLQALESNPPSSTAFVNDAYEEDAADEAEQLYALVDYQEVCAALSRVQKSLSSCGEGALAARVSGAAGALLCPALRTALATRSAVLTAVRHKRPGLSPPQTHRATDRLKDCIDVLGSHTSSGSETSALAAELLSILGGLEVESVIQAHDQAAALLDPSCFNRVKRNKTGFNNYKDDKYLPNSPDDSSENIKIIKIEKTNEPLGATVRNEGEAVIIGRIVRGGAAEKSGLLHEGDELLEVNGVSMRGKSVHEVCQVLGGLAGTLSVVLAPRPRPRPPPAYRVLHVRAHFDYDPEDDVYIPCRELGISFQKGDVLHVISREDPNWWQAFREGEEDQTLAGLIPSQAFQHQRESMKLSLAGEAGSAARRSRKGATLLCARARRRKPRKPTSEAGYPLYSAQPDEFEAEEILTYEEVALYYPRASHKRPIVLIGPPNIGRHELRQRLMEDSTRFAAAVPHTSRARKDHEAAGQDYHFISRAQFEADILNRKFVEHGEYEKAYYGTSVEAIREVVNSGKICVLNLHPQSLRILRGSDLKPYTVFVAPPSLEKLRQKKIRNGEAFKEEELKEIIATARDMELRWGHLFDMIIINNDTQRAYQQLLNEINSLEREPQWVPAHWLKQT-
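
Consistency check (illustrative sketch):

The figures in this record can be cross-checked with a few lines of Python. The sketch below rests on two assumptions that the record itself does not state: that the 3420 bp transcript is the coding sequence only (start codon through stop codon, no UTRs), and that the InterPro ranges are 1-based, inclusive residue positions on the 1139 aa protein. All function and variable names are hypothetical and chosen for illustration.

```python
# Figures copied from the record above; names are illustrative only.
TRANSCRIPT_BP = 3420
PROTEIN_AA = 1139

# InterPro hits as (start, end, accession) in protein coordinates.
DOMAINS = [
    (941, 1125, "IPR008145"),
    (942, 1120, "IPR008144"),
    (292, 446, "IPR001452"),
    (682, 805, "IPR001478"),
    (321, 382, "IPR011511"),
]


def cds_length_consistent(transcript_bp, protein_aa):
    """True if the transcript equals the protein's codons plus one stop codon.

    Assumes the transcript contains no untranslated regions.
    """
    return transcript_bp == 3 * (protein_aa + 1)


def domain_spans(domains, protein_aa):
    """Return {accession: length in residues}; reject out-of-range hits."""
    spans = {}
    for start, end, acc in domains:
        if not (1 <= start <= end <= protein_aa):
            raise ValueError(f"{acc} lies outside the protein: {start}-{end}")
        # Coordinates are assumed to be inclusive at both ends.
        spans[acc] = end - start + 1
    return spans


if __name__ == "__main__":
    print("CDS length consistent:", cds_length_consistent(TRANSCRIPT_BP, PROTEIN_AA))
    for acc, length in domain_spans(DOMAINS, PROTEIN_AA).items():
        print(acc, length, "residues")
```

Under these assumptions the lengths agree (3 x (1139 + 1) = 3420), and every listed domain falls inside the protein.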