Monarch geneset OGS2.0

DPOGS201864
Transcript: DPOGS201864-TA (3009 bp)
Protein: DPOGS201864-PA (1002 aa)
Genomic position: DPSCF300191 + 77157-85827
RNAseq coverage: 628x (Rank: top 20%)
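
The lengths reported above agree with each other if the transcript is read as a coding sequence that ends in a stop codon: 3009 bp is 1003 codons, which is 1002 residues plus the stop. The sketch below checks that arithmetic and computes the genomic span. It assumes the coordinates are 1-based and inclusive, which the record does not state, and the function names are hypothetical helpers rather than part of any database tool.

```python
import re


def expected_protein_length(cds_bp, has_stop=True):
    """Residue count implied by a coding-sequence length in base pairs."""
    if cds_bp % 3 != 0:
        raise ValueError(f"CDS length {cds_bp} is not a multiple of 3")
    codons = cds_bp // 3
    # The final codon is a stop and contributes no residue.
    return codons - 1 if has_stop else codons


def parse_position(text):
    """Split 'SCAFFOLD STRAND START-END' into (scaffold, strand, start, end)."""
    match = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return scaffold, strand, int(start), int(end)


scaffold, strand, start, end = parse_position("DPSCF300191 + 77157-85827")
span = end - start + 1  # assumption: 1-based, inclusive coordinates

print(expected_protein_length(3009))  # 1002
print(span)                           # 8671
```

Under that coordinate assumption the locus spans 8671 bp on the plus strand, noticeably longer than the 3009 bp transcript.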
Annotation
Source          Hit                     E-value  %
Heliconius      HMEL006990              0.0      65.33%
Bombyx          BGIBMGA006039-TA        0.0      73.75%
Drosophila      NFAT-PB                 2e-114   50.92%
EBI UniRef50    UniRef50_UPI00020643E3  1e-129   52.14%  UPI00020643E3 related cluster n=2 Tax=unknown RepID=UPI00020643E3
NCBI RefSeq     XP_391906.3             3e-128   51.31%  PREDICTED: similar to NFAT CG11172-PA [Apis mellifera]
NCBI nr blastp  gi|383864604            1e-129   50.50%  PREDICTED: uncharacterized protein LOC100881293 [Megachile rotundata]
NCBI nr blastx  gi|383864604            5e-125   35.44%  PREDICTED: uncharacterized protein LOC100881293 [Megachile rotundata]
Group
Gene Ontology:
  GO:0006355  1.8e-145  regulation of transcription, DNA-dependent
  GO:0003700  1.8e-145  sequence-specific DNA binding transcription factor activity
  GO:0005634  2.5e-67   nucleus
  GO:0005515  1.8e-13   protein binding
KEGG pathway: ame:408354  7e-128
  K04446 (NFATC, NFAT) maps to:
    Axon guidance
    Wnt signaling pathway
    Natural killer cell mediated cytotoxicity
    T cell receptor signaling pathway
    B cell receptor signaling pathway
    VEGF signaling pathway
InterPro domain:
  [231-515]  IPR008366  1.8e-145  Nuclear factor of activated T cells (NFAT)
  [228-405]  IPR011539  2.5e-67   Rel homology
  [227-412]  IPR008967  5.7e-59   p53-like transcription factor, DNA-binding
  [406-507]  IPR013783  6.3e-30   Immunoglobulin-like fold
  [411-509]  IPR014756  3.1e-25   Immunoglobulin E-set
  [411-507]  IPR002909  1.8e-13   Cell surface receptor IPT/TIG
Orthology group: MCL13605 (Single-copy universal gene)

Nucleotide sequence:

>DPOGS201864-TA
ATGAGCGCCCGCGTTCACCGGAAAGTGATGCGCGCGCCTCATAAGCGAGCGCATCCGGGGAAAATGCTCCACGCCGGGAAACTGGTGCACCCGGGGAAGGGAATCCACCCGGGCAAGTTCGCACATGTGGGGAAGTTCGGGAAACTCGGTCACTACACGCACACGCACTCGTTGCGTCCACCGGAGCCCTGCGACAATAGTAACGACAGCGGTCTGGGGCCCGACCCCGTCAACAGATTGTCGGAGGTGGCCGAGGAGTGGGAAGAGCCGGAAACAAAGAGGCGGTGCGAGAGAGAGACGGGTGTTAAGATAGAGTGCGATGACGCTAACGATGCGTACGCGTTCGCTCTACCCGCCGCGCCCTCGCACGCGCCCAGCGCCCCTGACACTGCGCCCGCTACACTACCGATTCCCGCAATCAAAGCCGGTTCCAGCATTTCTAGTCCAAGCATAGCAGAGACGACTGGCAAAAGATTTCAAAATCCAACGTATTGGTCGTACGGGAAACTGGAGAGTAGCGGCAAGCTAGCCAATAAGTATGCCAGTGGTGGCAAATTAGGGGGAAAATTGGGTGGTAAATTATCAGGGAAGCATGGTGGAAAATTAGCCCAGGCATTGGCGCGACGAAGAGCGCTGGCAGCTATGACACACTCAGGGCCGCCGGGCTTATCGGCGCCGCTCTCATGCAAATCTAGAGATGGGACGGTTGAGCTGCAAATACTATGTCAACCGGAGACACAGCATAGAGCGAGATATCAGACCGAAGGAAGTCGAGGAGCAGTTAAAGACAATTCTGGGAATGGCTTCCCCGTTGTGAAACTTGTCGGTTATGACAAGCCAGCCGTGCTTCAGGTATTCATAGGTACTGATACAGGACGCGTCGCCCCTCATATGTTTTACCAGGCGTGCCGCGTCTCTGGCAAGAACTCTACGCCGTGCAAAGAAAGGAAAGAAGATGGGACTGTTGTCATTGAAATTGACTTGGAACCGGCAAAAAACTGGCAGGTCACTTGCGACTGCGTAGGAATTCTGAAGGAACGTAACGTGGATGTAGAGCACAGGTTCGGCGAGGCCCTGGGCGGTGGGGCGGGCGGGGTGCACGCGGCGCGCGGGAAGAAGAAGTCCACAAGATGCCGCATGGTCTTCCGCACCGAGATACTGGACTCCAACGGACAAACAGAGACCCTGCAAGTCTGCTCCACACAGATCATATGCACCCAGCCTCCCGGAGTACCGGAAGTGTGCAGGAAGTCGCTGGTGTCGTGTCCGGTGACGGGCGGCTTGGAGTTGTACCTGCTGGGGAAGAACTTCCTGAAGGAGACCAGAGTGGTGTTCCGCGTCAAACAAGACGGAGTCACCTGGGAGGAGGAGGTCGTTCCGGACAAGGAGTTCTTGCAACAGACCCACCTGGTGTGCTGCGTGCCGGCGTACTCGCGCCCCGACATCCAGGAGGCGGTGTGCGTCCAGTTGTTCGTGCGCTCGGGCGGCAAGTCCTCCGAGCCGCACGCCTTCTACTACACGCCGGCCGGGGCGCGCGCGATGCACTGCACCCAGCACCCCGCGCACACCCCCCTCACCCCCCACACCCCTCACACAGGTGAGGCTGCTCTCATGCCGCCGCCTCTCGCCCCCCTCGCGCCTCTCCCGCCCGCTCGCCGGACCTCTCTGCTGCACGACCCTCACTCTCCGCTGGGACTCAAGAGCGAGGTGGACGAGTCCAGCCAACACTCGCTGCTGGAGGGCGAGCGCTCGGAGCCCGACGCGCCGCTCGCCGATGACATCATGGACCTGCGGCTCAAGTCGGAGACCATCACCTGCGACTCGCAGACACAGGTGGGTTTCGTGAGCGGTTACGACTCTATCAAGCTGTCTCCGAACACTACGTCGCGGGACGAGTCCCCGTCCGTGATCGCGTCTTTCACACAGCAGCTGCAGGCCATACAGAACCAGGTGCAGACGGACAAGATGGTCGAGTCCGTGACCGCCGCTATCTTCAACTC
TGACAACGCCGGCCAGATGTACGAGCAGCCGCTGCTTCCCATCAACACCATGGACACCATGCAGCGGATCATGTCTGCCAAGACCGCCATGGACCCGCTCGACCGAGACATGAAAGTACTGAACTCAGACCTGATGATGACCGGCGACCCCATGCAGACCAGTGTACTGCAGACTGCCGGTGAACAGCGACTCATGGTGTACGATCAAGTGCCGCCAGCCCGCGAGGACGGCTTTAACCCTTTCGGAGCCATCGGCAAGATGGAGGCGACTCAGATAAAACAACGGCTGGCGCAGCAAACCGCGCACATGGACGCGCTCGTGGAGGACGCCATGCGCTCCGCCGGCGCCACCATCATGCCCGGCGACGCCGCCAAGCTCGACGAGCTGGTGAACTCGCGAGTAGAAGACCACCTGGGCGGCACGGGGACCTCGCCTTCCGGCGCTTCGCACGCTTCCGACGTGCTGCTTAGTCCCGGGGCGGCGGTCGTGCCCCGCACCTCCGACCTCCTGCTGCCGCTGGCCGCCACCACCATGTCCCCGGACGTGATCCTCGATCCTCAAGTGTCCCCCTCGATGCTATGCGACTCCTCGCAGCGTATCGTGCTCCCGCCGCGGTCGCAGGACGAGCTGATGATGATGCCGGACATCCCGTCGTCCGTGAAGACGCCGCCGGCCGCCGTCAAGTCTATGATCCTGAACGCGGCCGCCGAGATTCTGACCTCGGACCGCGCCATGAACGCGCTCGTCACTTCCGCCATCAACACAGCCAACATGGCGGCGGCGGACGCGGCGCCGGCGGACGAGCCCGCGGCCGCCATGTCGCAGGCCGTGTCGCAGGCCGTCACTCAGGCGGTGTCGCAGGCCGTGTCGCAGGCGGTGTCGCAGGCCGTGTCGCAGGAGATGACTGCGCCCGTGCAGGGCCTCACGGACATGAGCGACCAGGACCTGCTGTCCTACATCAACCCCAGCACCTTCGACCAGGGTGAGTACTACATCGACCTGTTCTAG

Protein sequence:

>DPOGS201864-PA
MSARVHRKVMRAPHKRAHPGKMLHAGKLVHPGKGIHPGKFAHVGKFGKLGHYTHTHSLRPPEPCDNSNDSGLGPDPVNRLSEVAEEWEEPETKRRCERETGVKIECDDANDAYAFALPAAPSHAPSAPDTAPATLPIPAIKAGSSISSPSIAETTGKRFQNPTYWSYGKLESSGKLANKYASGGKLGGKLGGKLSGKHGGKLAQALARRRALAAMTHSGPPGLSAPLSCKSRDGTVELQILCQPETQHRARYQTEGSRGAVKDNSGNGFPVVKLVGYDKPAVLQVFIGTDTGRVAPHMFYQACRVSGKNSTPCKERKEDGTVVIEIDLEPAKNWQVTCDCVGILKERNVDVEHRFGEALGGGAGGVHAARGKKKSTRCRMVFRTEILDSNGQTETLQVCSTQIICTQPPGVPEVCRKSLVSCPVTGGLELYLLGKNFLKETRVVFRVKQDGVTWEEEVVPDKEFLQQTHLVCCVPAYSRPDIQEAVCVQLFVRSGGKSSEPHAFYYTPAGARAMHCTQHPAHTPLTPHTPHTGEAALMPPPLAPLAPLPPARRTSLLHDPHSPLGLKSEVDESSQHSLLEGERSEPDAPLADDIMDLRLKSETITCDSQTQVGFVSGYDSIKLSPNTTSRDESPSVIASFTQQLQAIQNQVQTDKMVESVTAAIFNSDNAGQMYEQPLLPINTMDTMQRIMSAKTAMDPLDRDMKVLNSDLMMTGDPMQTSVLQTAGEQRLMVYDQVPPAREDGFNPFGAIGKMEATQIKQRLAQQTAHMDALVEDAMRSAGATIMPGDAAKLDELVNSRVEDHLGGTGTSPSGASHASDVLLSPGAAVVPRTSDLLLPLAATTMSPDVILDPQVSPSMLCDSSQRIVLPPRSQDELMMMPDIPSSVKTPPAAVKSMILNAAAEILTSDRAMNALVTSAINTANMAAADAAPADEPAAAMSQAVSQAVTQAVSQAVSQAVSQAVSQEMTAPVQGLTDMSDQDLLSYINPSTFDQGEYYIDLF-
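
The nucleotide record above is wrapped over more than one line, and the protein record ends in a `-` character, which appears to mark the stop codon (an assumption; the record does not say so). The sketch below reads FASTA text, joins wrapped lines, and drops a trailing stop marker so that sequence lengths can be compared with the values in the record header. The helper names and the toy records are illustrative only.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record id.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def strip_stop_marker(protein):
    """Remove trailing '-' or '*' characters (assumed stop-codon markers)."""
    return protein.rstrip("-*")


# Toy records that mimic the layout above: a wrapped CDS and a protein with '-'.
example = """>toy-TA
ATGGCC
TAG

>toy-PA
MA-
"""

records = read_fasta(example)
cds = records["toy-TA"]
protein = strip_stop_marker(records["toy-PA"])

# A CDS with a terminal stop codon should be 3 * (residues + 1) bases long.
print(len(cds) == 3 * (len(protein) + 1))
```

Applied to the two records above, the same comparison would test whether the sequences match the stated 3009 bp and 1002 aa.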