Monarch geneset OGS2.0

DPOGS210220
Transcript         DPOGS210220-TA  4695 bp
Protein            DPOGS210220-PA  1564 aa
Genomic position   DPSCF300196 - 542857-555184
RNAseq coverage    687x (Rank: top 19%)
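
The transcript and protein lengths in this record agree if the transcript is read as a bare coding sequence: 1564 codons plus one stop codon span 4695 nucleotides. A minimal Python sketch of that arithmetic follows. The function name `cds_length_matches` is hypothetical, and the assumption that the transcript carries no UTRs is inferred from the two lengths rather than stated by the record.

```python
def cds_length_matches(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript is exactly (protein_aa + 1 stop codon) * 3 nt long."""
    return transcript_bp == (protein_aa + 1) * 3


# Values copied from the record: DPOGS210220-TA is 4695 bp, DPOGS210220-PA is 1564 aa.
print(cds_length_matches(4695, 1564))  # prints True
```
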
Annotation
  Heliconius       HMEL022072         0.0  72.35%
  Bombyx           BGIBMGA002547-TA   0.0  75.83%
  Drosophila       Nrx-IV-PB          0.0  59.98%
  EBI UniRef50     UniRef50_Q94887    0.0  61.72%  Neurexin-4 n=16 Tax=Arthropoda RepID=NRX4_DROME
  NCBI RefSeq      XP_002030239.1     0.0  56.96%  GM25331 [Drosophila sechellia]
  NCBI nr blastp   gi|312374653       0.0  56.60%  hypothetical protein AND_15691 [Anopheles darlingi]
  NCBI nr blastx   gi|312374653       0.0  56.60%  hypothetical protein AND_15691 [Anopheles darlingi]
Group
Gene Ontology
  GO:0007155  9.1e-25  cell adhesion
  GO:0000287  6.9e-22  magnesium ion binding
  GO:0009059  6.9e-22  macromolecule biosynthetic process
  GO:0008897  6.9e-22  holo-[acyl-carrier-protein] synthase activity
KEGG pathway       oaa:100073864  0.0
  K07380 (CNTNAP2) maps-> Cell adhesion molecules (CAMs)
InterPro domain
  [338-464]    IPR008979  2.9e-42  Galactose-binding domain-like
  [1075-1245]  IPR008985  3.4e-42  Concanavalin A-like lectin/glucanase
  [1079-1264]  IPR013320  4.2e-38  Concanavalin A-like lectin/glucanase, subgroup
  [489-619]    IPR001791  1.8e-34  Laminin G domain
  [497-618]    IPR012680  2.7e-27  Laminin G, subdomain 2
  [349-459]    IPR000421  9.1e-25  Coagulation factor 5/8 type, C-terminal
  [119-264]    IPR008278  6.9e-22  4'-phosphopantetheinyl transferase
Orthology group    MCL10237  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210220-TA
ATGATGACCTACTCCGTGTTTTATAATATTTTGCTATTTTGCTACACTCTTTCTTGTACTAAAGCAGAATTAGCAAGATATTATTACGACTACGAATGTAATGAGCCTTTAATCGAGGGAGCTAAACTCACCGCTACTTCCAGTCTGAGAGACAGAGGACCTGAGAATGCAAAGTTGCTTGGACGTCTCATGTTACGGAGATGTGTTCATCTGATGCTGGACATTCCTTATGAAAATATTAGATTTGGAAGGGACGAACACAGGAAGCCATATCTCATAGGTGCTGGCGACATTCCAGCTTACTTCAATGTATCACATCAGGGAGATTATGTTGTTCTAGCTGGTAGCACTAAACAGAATATTGGTGTTGATACCATGAAAATAGAACCACCAGCAAATAAAAACTTGTCAGAGTTCTTTCGACTCATGAAGAGACAAATGTCTGATCATGAATGGTCGACTATATACAGTTATCCCACGGAAGCTCAGCAGATCGCATGTTTTTACAGATTGTGGTGCTTGAAAGAGAGTTACGTAAAGAATATAGGAGTCGGTATAACTGTAGCGCTCCATAAAATAAGTTTTGACATTAAATCACCATTGAAAGTAGGACAAATTGTGACGGACACAAACTTGTATGTGAATAATGTGCTGAAATCCGATTGGAGGTTCGAAGAGACCTTGCTGGATGAGAAACATGCTGTGGCCACTTCACTCCAGGTCGAGGATGAATCTATAGCACCTGCTGTGTACGACTTATTATCTTTTGATGAAATTGTCAAGGAAGCTAAACCATTTGTCGAACCAGATGTCAGTTTGAACGCGTGGACAGCATCAGAGAACGATTTCGATCAGCAGCTGATCATAGACTTGGGGACGGTGAAGAACATCACCCGGGTGGCCACACAGGGCCGGCAACACTCGCAGGAGTTCGTTCAGGAGTATCACATTAGCTACGGTACTAATGGACTCGATTATGTCATGTACAAAGCGCCGGGGGGCGAAGTTAAGAGTAGCCACCATCGACTAACAAGTAGCGCTAACATTGGTGAAGGAAAATCAGCATGGACGACGGTCGAGAGCTCTTACTACCAGCACCTGACGATCAATCTACACTCACGGAAAGAGTTGCGCGGTGTAGCCACCAGGGGGCGCTTCGCCACCGACGAGTACGTCTCGGAGTACATGATACAGTACTCCGATGACGGTGAGACCTGGACCGCGGTCGCTGACGCGGACGGCTACACGCAGATGTTCGAAGGCAACCATGATGGCAACACTGTAGTCAAGAACGAGTTCGACGTTCCTATCATAGCTCAGTACATCAGGATCAATCCCATGAGGTGGAGGGACAAGATATCCATGAGGGTCGAGCTGTACGGGTGTGATTATGTTGCTGATACGTTGTTCTTCAACGGGTCGTCCCTGGTTCGTATGGACCTGCTTCGCTCCCCCGTGTCCTCCTCCCGCGAGGCCATCCGTTTCCGGTTCAAGACCTCTGCCGCTTCCGGCGTGCTGTTATACTCTCGCGGCACCCAGGGCGACTACCTCGCGCTGCAGCTCCGAGACAACCGCCTCGTCCTCAACATAGACCTCGGGTCGGGGAAGGCGACGTCGTTGTCGGCGGGCAGCTTGTTGGACGACAACACTTGGCACGACGCGCTGGTGTCCCGCGCCCGCCGCGACCTCGTGTTTTCGGTGGACCGCGTGGTCATGAGGGCGCGGATCAAGGGCGAGTTCTCCCGACTCAACCTCAACAGAGCGATTTATATCGGCGGGGTGCCAAATTTCCAAGAGGGTCTGGTGGTGACACAGAACTTCACAGGGTGTATAGAGAACATGTATCTGAACGCCACCAACGTCATCCAGGAGCTCAAGATGGGGTACGAGGCCGCGGAGCCCTTCAAGTATCAGAAAGTCAACACCTTGTATTCCTGCCCTGAGCCGCCGGTCGTGCCGATCACTTTCCTGAAAGAGGGCTCGTACGCCAAGCTGCGCGG
CTACGGCGGGGGCGGCGTGCTCAACGTGTCGCTGGAGTTCCGCACGTACGAGCACCACGGACTGCTCGTATACCATCAGTTTAAAAGTGAAGGATACGTCAAGGTGTTCCTTGAGGAGGGCAAGGTGAAGGTGGAGTTGTTCACGGAGGGGTCTCCCAAGGTGAAGCTGGACAACTTCGAGGACACCTTCAACGACGGTCGCTGGCACGCTCTCATGCTGACCATGGCCCAGGACAGCCTCACCCTGTCTCTCAACTACAGGGCCGTCAGAACCAGCAAGAAGATGAAGTTCTTCACCGGGGGCTACTACTACATAGCAGGTGGCAAGGCGCCGCCCCGTGGGTTCGTGGGCTGTATGCGGAAGCTGGCCGTGGACGGCAACTACCGCTCGCCCACGGACTGGACGCGCGAGGAGTACTGCTGCCCCGACGAACTCGTGTTTGACGCCTGCCACATGATAGACAGGTGTAACCCCAACCCGTGCGAGCACGGCGGCGTGTGTACGCAGAGCGCGGACGAGTTCGCCTGCGACTGCACCGACACCGGCTACGCGGGCGCCGTCTGCCACACGTCGATCCACCCCGTGTCGTGTGCGGCGTATGCGTGGTCGGGGGCCACGGGGCGGCGCTCCACCCGGGTGCTGTTGGACGTGGACGGCTCGGGTCCCCTGCCGCCCTTCCCCGCCACCTGCCACTTCTACGCGGACGGTCGCATCATAACGTCGGTGCAGCACTCGGCGGTGTCCAGCACGCAGGTGGACGGCTTCCAGGAGGCGGGCAGCTTCAGGCAGGACGTGACGTACGACGCAACGCGGCCGCAGCTCGAGGCGCTGCTCAACAGGAGCCACTCGTGCAGCCAGCGATTGGAGTACATGTGCCGACACTCCAGACTGCTCAATTCGCCCAGCGAGGAGGCCACGTTCCAGCCGTTCGCGTGGTGGGTGTCTCGCAGCGGGCAGCGGATGGACTACTGGGCGGGGGCGCAGCCAGGATCCCGCATGTGTGAGTGCGGGGTGCTCGGCACGTGTCTCGACCCTACCAAGTGGTGCAACTGTGACGCCGAGCACTCGCCCATGCCTCACGACGAGTTTCAAACTGACGGCGGAGACATCACGGAGAAGGAGTTCCTGCCGGTGAAGCAACTTCGCTTCGGTGACACGGGCAGCCACCTCGACGAGAAGATAGGGAGGTACTCGCTGGGACCCCTGCTGTGCGAGGGAGACGACCTGTTCTCTAACGCCGTCACGTTCCGTATCTCGGACGCGGTCATCACCCTGCCGACCTTCGACCTGGGTCACAGTGGAGACATCTACTTTGAATTCCGGACCACTAAAGAAAACGCTGTCCTGTTACATTCGAAGGGCACTCAAGACTATATAAAGCTGTCAATAATCGGCGGAGACCAGCTGCAGTTCCAGTTCCAGGTGGGGGACACGCCCCTCGGGGTCTCCGTGGAGACCAGTAACCGGTTGGCCGACGACCAGTGGCATTCCGTCTCTATCGAGAGGAACAGGAAGGAGGCCCGCGTAGTTGTGGACGGAGCCTTGAAGAACGAGATACGAACGGCCAAGGAACCGGTCCGCGCCCTCCAGCTAGCCACCCCGCTGGTGCTGGGGGCCAGCCTCGACAGGAAGGACGGGTTCGTGGGCTGCATGAGGGCGCTCCTACTGAACGGACGGCCTGTGGACCTGCGGGGACACGCCAGGAGAGGTCTGTACGGCGTGTCGGAGGGCTGCGTGGGCAAGTGCTCGTCATCGCCGTGTCTGAACAACGGCACGTGTCTGGAGCGGTACGACTCGTACTCCTGCGACTGCCGCTGGACCGCCTTCAAGGGCCCCATCTGTGCTGACGAGATCGGCGTCAACCTCCGCCCCAACTCCATGGTGAAGTACGACTTCCTGGGCTCGTGGCGCTCCACCATCAACGAGAAGATCCGCGTGGGCTTCACCACCACCAACCCCAAGGGCTTCCTGCTCGGCTTCTACTCCAACATATCCGGGG
AGTACCTCACGCTCATGGTCTCCAACTCAGGTCACCTGCGCGTGGTGTTCGACTTCGGGTTCGAGAGGCAGGAGATCATCTTCGAGGGGAAGCACTTCGGCCTGGGACAGTACCACGACGTGCGCCTCTCCAGGAAGGACAGCGGCGCTACCATGGTGCTGCAGGTGGATAACTATGAGACCCAGGAGTACCAGTTCAACATCCGCGAGTCTGCGGACGCGCAGTTCAACAACATCCAGTACATGTACGTGGGCAGGAACGACTCCATGGCCGAGGGCTTCGTGGGCTGCGTCAGTCGCGTGGAGTTCGACGACATCTACCCTCTCAAGCTGCTGTTCCAGCAGGACCCGCCGCCCAACGTCAGGAGCATCGGCGGTCCCCTGCACGAGGACTTCTGTGGCGTGGAGCCGGTGACGCACCCGCCCGTGATCCCGGAGACCCGGCCGCCGCCGCCCGCCGACCTCGCCGCGGACCTCGACTTCCACCGGACCGACGAGGCCATACTAGCCACGGTGCTGGCGTTCGTGTTCCTGCTGCTGATAGCGGTGGCCGTGGTGCTGGTGAGGGCGCTGTCCCGCCACAAGGGAGAGTACCTCACACAGGAGGAGCGCGGGGCGGCGGGAGCGGCGGGGCCTGACGACGCCGCGCTGGCGGCCGCCACGGGGGCCCGGGTCACCAAGCGGTTCTTTATATAG

Protein sequence:

>DPOGS210220-PA
MMTYSVFYNILLFCYTLSCTKAELARYYYDYECNEPLIEGAKLTATSSLRDRGPENAKLLGRLMLRRCVHLMLDIPYENIRFGRDEHRKPYLIGAGDIPAYFNVSHQGDYVVLAGSTKQNIGVDTMKIEPPANKNLSEFFRLMKRQMSDHEWSTIYSYPTEAQQIACFYRLWCLKESYVKNIGVGITVALHKISFDIKSPLKVGQIVTDTNLYVNNVLKSDWRFEETLLDEKHAVATSLQVEDESIAPAVYDLLSFDEIVKEAKPFVEPDVSLNAWTASENDFDQQLIIDLGTVKNITRVATQGRQHSQEFVQEYHISYGTNGLDYVMYKAPGGEVKSSHHRLTSSANIGEGKSAWTTVESSYYQHLTINLHSRKELRGVATRGRFATDEYVSEYMIQYSDDGETWTAVADADGYTQMFEGNHDGNTVVKNEFDVPIIAQYIRINPMRWRDKISMRVELYGCDYVADTLFFNGSSLVRMDLLRSPVSSSREAIRFRFKTSAASGVLLYSRGTQGDYLALQLRDNRLVLNIDLGSGKATSLSAGSLLDDNTWHDALVSRARRDLVFSVDRVVMRARIKGEFSRLNLNRAIYIGGVPNFQEGLVVTQNFTGCIENMYLNATNVIQELKMGYEAAEPFKYQKVNTLYSCPEPPVVPITFLKEGSYAKLRGYGGGGVLNVSLEFRTYEHHGLLVYHQFKSEGYVKVFLEEGKVKVELFTEGSPKVKLDNFEDTFNDGRWHALMLTMAQDSLTLSLNYRAVRTSKKMKFFTGGYYYIAGGKAPPRGFVGCMRKLAVDGNYRSPTDWTREEYCCPDELVFDACHMIDRCNPNPCEHGGVCTQSADEFACDCTDTGYAGAVCHTSIHPVSCAAYAWSGATGRRSTRVLLDVDGSGPLPPFPATCHFYADGRIITSVQHSAVSSTQVDGFQEAGSFRQDVTYDATRPQLEALLNRSHSCSQRLEYMCRHSRLLNSPSEEATFQPFAWWVSRSGQRMDYWAGAQPGSRMCECGVLGTCLDPTKWCNCDAEHSPMPHDEFQTDGGDITEKEFLPVKQLRFGDTGSHLDEKIGRYSLGPLLCEGDDLFSNAVTFRISDAVITLPTFDLGHSGDIYFEFRTTKENAVLLHSKGTQDYIKLSIIGGDQLQFQFQVGDTPLGVSVETSNRLADDQWHSVSIERNRKEARVVVDGALKNEIRTAKEPVRALQLATPLVLGASLDRKDGFVGCMRALLLNGRPVDLRGHARRGLYGVSEGCVGKCSSSPCLNNGTCLERYDSYSCDCRWTAFKGPICADEIGVNLRPNSMVKYDFLGSWRSTINEKIRVGFTTTNPKGFLLGFYSNISGEYLTLMVSNSGHLRVVFDFGFERQEIIFEGKHFGLGQYHDVRLSRKDSGATMVLQVDNYETQEYQFNIRESADAQFNNIQYMYVGRNDSMAEGFVGCVSRVEFDDIYPLKLLFQQDPPPNVRSIGGPLHEDFCGVEPVTHPPVIPETRPPPPADLAADLDFHRTDEAILATVLAFVFLLLIAVAVVLVRALSRHKGEYLTQEERGAAGAAGPDDAALAAATGARVTKRFFI-
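
Both sequences above are FASTA records, and the nucleotide record is wrapped over several lines, so its lines have to be joined before its length is compared with the listed 4695 bp. A minimal Python sketch of reading such records follows. The function name `parse_fasta` and the toy record are hypothetical and are not part of the original record.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {record_id: sequence}, joining wrapped sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Toy input (hypothetical), wrapped over two lines like the nucleotide record above.
toy = ">toy-TA\nATGAAA\nTAG\n"
print(parse_fasta(toy)["toy-TA"])  # prints ATGAAATAG
```

Applied to the records above, the joined DPOGS210220-TA sequence is expected to be 4695 nt long, and the DPOGS210220-PA sequence 1564 residues long once the trailing `-` stop marker is dropped.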