Monarch geneset OGS2.0

DPOGS210595
Transcript        DPOGS210595-TA  3336 bp
Protein           DPOGS210595-PA  1111 aa
Genomic position  DPSCF300168 - 310487-318055
RNAseq coverage   797x (Rank: top 16%)
Annotation
Heliconius      HMEL003090        0.0  59.65%
Bombyx          BGIBMGA013544-TA  0.0  57.63%
Drosophila      CG1371-PA         0.0  34.86%
EBI UniRef50    UniRef50_E2BYT8   0.0  34.85%  Nodal modulator 2 n=2 Tax=Formicidae RepID=E2BYT8_HARSA
NCBI RefSeq     XP_002061560.1    0.0  36.24%  GK20961 [Drosophila willistoni]
NCBI nr blastp  gi|195426984      0.0  36.24%  GK20961 [Drosophila willistoni]
NCBI nr blastx  gi|156549935      0.0  34.98%  PREDICTED: nodal modulator 1-like [Nasonia vitripennis]
Group
Gene Ontology    GO:0030246  5.4e-10  carbohydrate binding
                 GO:0004180  5.4e-07  carboxypeptidase activity
                 GO:0005488  3.7e-06  binding
KEGG pathway 
InterPro domain  [112-195]  IPR013784  5.4e-10  Carbohydrate-binding-like fold
                 [121-180]  IPR013783  1.3e-07  Immunoglobulin-like fold
                 [319-384]  IPR014766  5.4e-07  Carboxypeptidase, regulatory domain
                 [734-816]  IPR008969  7.4e-07  Carboxypeptidase-like, regulatory domain
                 [28-108]   IPR008970  3.7e-06  Collagen-binding surface protein Cna, B-type domain
Orthology group  MCL11283  Single-copy universal gene

Nucleotide sequence:

>DPOGS210595-TA
ATGTTCGAGGGAAATAGTTTTTATGTTCATTTAATTTTCTTATTATCATTAATATCTTCAAGTTACAGTAATGATATCTTAGGATGTGGTGGTTTCGTAAAAAGTCATGCCAGCATAGACTTTTCTAAAATCGAAATCGGCTTATATACTAGAGACGGTAGTCTCAAAGAGAAGACTGAGTGCGCACCCACAAACGGTTACTATTTCTTACCTCTATACGAAAAGGGTGAATATGTTTTAAAAGTTCATCCGCCCGCTGGCTGGAGTTTTGAGCCATCGCAGGTAGAATTAGACATAGACGGTGTGACCGATCAGTGTTCGATCGGCCAAGACATAAACTTCGCTTTCAACGGATTCGGTATAACCGGCAAGGTGATCACCGCCGGTCAGGTCAGCGGCCCCAGTGGTATCAATATACAGCTTGTGAATGAGAAAGGAGAGACCAGAAACACAGTTACAACATCTGGCGGAGATTTCCATTTCACCCCTGTGATACCTGGAAAGTATGTTGTTAAAGCGTCTCACCCTCGATGGAAGTTGGAGCCTGCACATACTGTTGTGCAAGTGAAGGAAGGTAACACCGCTTTGCCTGTGGGGGTTTTAGCCGTTAAAGGCTATGATGTTTCGGGTTCCGCGACATCATTCGGCAGCCCCTTGGGTGGAGTTCATGTACTACTTTACTCCAAAGAGGAGAAACCTAAGTTCCGCGTGGAGGGTTGCAAGACTGCACTTCTTCAAGGTGTTCCGGATGCTCCTATTTGTTATTCAGTCACTGATGCTAATGGGGAGTTTAAGTTTGGTCTCGTGCCAGCTGGAGAGTACAAACTACTAGCTTTGGCCAAGACACCGGGGCAGACCTTCCTCACATACAACATCAAGCCTGATTCGGTGCCGTTCAGTGTACTTCATGATAGCTTGTATATTAGAAATGCTTTTGAGGTTATGGGATTCTCAATCGTGGGTTCCGCTCTGTCAGCTCCGGGTGGTAGTGGTATTGCAGGAGCTCAGGTGCTGTTGGCGGGACAAGCTGTCACCACCACCGACAAGAAGGGGCACTTCACACTCAGTGGACTGAAGCCAGGGGAATACTCACTGACCTTACAACATGAGCACTGCAGCTGGGAGGAGAAGCAGCTGTCTGTGAGTGCGAGCGGTGTGGGGAGTCCCTTGACGGTGGTGGCGTCACGCTGGAAGGTGTGCGGGTCGTTGACCCCACCCGAATCTCGCATCGTGCAGCTGCGTGGACCGAAGGACGAGGACCTCACAACTAAAGCTGATGGTAGCTGGTGTTCCCTGCTGCCCCCGGGCTCGTACTCCGCGCGCGTGTCCGTCACGGAGCAGGAGCAGAGGGACGGCCTCCAGTTCTATCCGGAGGTGCAGCACGTGTCCGTGGGCGGGGCGCCCGTCGGCGGGGTCTCCTTCAGCCAAGTGCGAGCGCGGGTGAGAGGCTCCGTGAACTGCGCCCCGTACTGCCGCGGCCTAAGAGTGGCGCTGCGCCCCCTCACAGCCGACGGGACTTACGCGGGCCCGCCACGCTACGCGAACATCGTCGACGGAGCGTACACGTTCGAGGAGGTGGTCCCTGGCAGCGTGGAGGTGTCAGTGGTGGAGGGCGGCGCGGGCGAGGCGCGGCTGTGCTGGAGGCAGGCCGCGCACAACGTGGTGGTGGCGCAGGACCTGCCGCCCGTCACCGAGTTCACACTCACCGGCCTCGGCCTCGTCATCACCGCCTCGCATGACATGGAGGTGGAGTACACGAGCGTCCACTCCTCGGGCGTGGTGAAGGTGTCTGCGGGCCGCAGCCTGGTGTGTGTGCCGCCCGCCCCTCGCTACACGCTCACCGCCCGCGGCTGTCACCGCGTCTCGCCGCCCACCGTGGACGTCGACATGCAGGGAACGGACATGCCGAGCGCGTCTTTCAAGGCGACGGCGCACGCCTCCACCATCACGATCTCGTCTCCGGAGCGCGCCACGGACGTGAGGTTGCACGTCACCACGGA
CGGCGGCCCCGCCACCGTGGACCTGCAGCCCGAGGCTCACGGCGACGGGTTCCTCTACACCCACACCATGTACCTGGCCGAGGGAGAGGTGGCGTCCGTGCTGATGGAGTCGTCGACCCTGCTGTCGGTCCCGGGCGGGCGGCAGGATGTGGTGGGGGCGGCGAGCTGCTCCAGGGCGCTCGCCCTCAGGGCGGTTCGAGCCAGGAAAGTCACGGGCCGAGTCGTTCCGCCAGTAGAAGGTGTCACCATCACTCTGCAGGGAGGTGACGTGAAGCTGTCTCAGGTGACCAAAGCCGACGGCCTCTACAGCTTCGGTCCCCTGGACGCGTCCGTGTCGTACAGCGTCACGGCCGAGAAGGAGTCGTACGTGTTCAGTGAGGTGGAGCCCTCGGGAGACGTGCGCGCTCACCGCCTGGCCGAGATACAAGTACAGCTCGTCGACGACAGCAACAACCAGCCGCTAGAGGGGGCGCTGGTGTCCATCTCCGGGGGCTCGTTCCGTCTGAACGCGCTGTCGGCGGCGGGGCGGGTGGCGGCGCGCTCGCTGGCCCCGGCCTCGTACTACGTGAAGCCACACATGAAGGAGTACCGCTTCCAGCCGCCTCACACGCTGCTGGACGTGGCGGACGGACAGACACACACGCTCACCTTCAGAGGTGTTCGCGTGGCGTGGTCAGCCGTGGGGCGCGCCGTGTGTGTGGGCGGCTCGGGGGTCCCCGGGCTGGCCCTCCGCGCTGTGGGGGACTCCGACTGTCACACGCAGGACGCCGTCTGTGATCAGGACGGATACTTCCGCATCCGCGGCCTGCTGCCCGGTTGTACGTATTCCATCCAGCTGAAGGAGTCCTCGGAGCCGGCGCGTCTGGCGGACACGCCGCTCGTCATAAAGATGACGGAGAGTGACGTGCTGGACCTGCGGGTGATCGTGATCCGGCCCCACCAGGTGTCGGACACGCTGGTGCTGGTGCGCTGCTCCAACCCCGACCACTACAAGACCCTCCGCCTGACCCTCAGCCGCGAGTCCTCCTCGCCCGTGTTCTCCACCAAGCTGGACCCGGCGGGCTACTCCCAGGTCAACAACCCCGGCCTGCTGTACCCGCTGCCTCGCCTGCCGGCCGATAATAACTCTTATGTGGTGTCGTTGGAGTCTACCCTGTCCAAGGTGACTCACTCCTATGAAGAGCCGGTGAAGTCGTCGGAGCAGGAGCTCCGCCAGTCGTCGCTGCTGCTGCTGCCGCTGCTGGGAGCGCTGGTGCTGCTGTACTTCCAGAGACACCGCCTGCTGGCTCTCGCACCCGACCCCATCAAGAGAGTATCGCGCAAGAAGACGCAGTAG

Protein sequence:

>DPOGS210595-PA
MFEGNSFYVHLIFLLSLISSSYSNDILGCGGFVKSHASIDFSKIEIGLYTRDGSLKEKTECAPTNGYYFLPLYEKGEYVLKVHPPAGWSFEPSQVELDIDGVTDQCSIGQDINFAFNGFGITGKVITAGQVSGPSGINIQLVNEKGETRNTVTTSGGDFHFTPVIPGKYVVKASHPRWKLEPAHTVVQVKEGNTALPVGVLAVKGYDVSGSATSFGSPLGGVHVLLYSKEEKPKFRVEGCKTALLQGVPDAPICYSVTDANGEFKFGLVPAGEYKLLALAKTPGQTFLTYNIKPDSVPFSVLHDSLYIRNAFEVMGFSIVGSALSAPGGSGIAGAQVLLAGQAVTTTDKKGHFTLSGLKPGEYSLTLQHEHCSWEEKQLSVSASGVGSPLTVVASRWKVCGSLTPPESRIVQLRGPKDEDLTTKADGSWCSLLPPGSYSARVSVTEQEQRDGLQFYPEVQHVSVGGAPVGGVSFSQVRARVRGSVNCAPYCRGLRVALRPLTADGTYAGPPRYANIVDGAYTFEEVVPGSVEVSVVEGGAGEARLCWRQAAHNVVVAQDLPPVTEFTLTGLGLVITASHDMEVEYTSVHSSGVVKVSAGRSLVCVPPAPRYTLTARGCHRVSPPTVDVDMQGTDMPSASFKATAHASTITISSPERATDVRLHVTTDGGPATVDLQPEAHGDGFLYTHTMYLAEGEVASVLMESSTLLSVPGGRQDVVGAASCSRALALRAVRARKVTGRVVPPVEGVTITLQGGDVKLSQVTKADGLYSFGPLDASVSYSVTAEKESYVFSEVEPSGDVRAHRLAEIQVQLVDDSNNQPLEGALVSISGGSFRLNALSAAGRVAARSLAPASYYVKPHMKEYRFQPPHTLLDVADGQTHTLTFRGVRVAWSAVGRAVCVGGSGVPGLALRAVGDSDCHTQDAVCDQDGYFRIRGLLPGCTYSIQLKESSEPARLADTPLVIKMTESDVLDLRVIVIRPHQVSDTLVLVRCSNPDHYKTLRLTLSRESSSPVFSTKLDPAGYSQVNNPGLLYPLPRLPADNNSYVVSLESTLSKVTHSYEEPVKSSEQELRQSSLLLLPLLGALVLLYFQRHRLLALAPDPIKRVSRKKTQ-